In this notebook, some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this project. You will not need to modify the included code beyond what is requested. Sections that begin with '(IMPLEMENTATION)' in the header indicate that the following block of code will require additional functionality which you must provide. Instructions will be provided for each section, and the specifics of the implementation are marked in the code block with a 'TODO' statement. Please be sure to read the instructions carefully!
Note: Once you have completed all of the code implementations, you need to finalize your work by exporting the Jupyter Notebook as an HTML document. Before exporting the notebook to html, all of the code cells need to have been run so that reviewers can see the final implementation and output. You can then export the notebook by using the menu above and navigating to File -> Download as -> HTML (.html). Include the finished document along with this notebook as your submission.
In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a 'Question X' header. Carefully read each question and provide thorough answers in the following text boxes that begin with 'Answer:'. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.
Note: Code and Markdown cells can be executed using the Shift + Enter keyboard shortcut. Markdown cells can be edited by double-clicking the cell to enter edit mode.
The rubric contains optional "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue the "Stand Out Suggestions", you should include the code in this Jupyter notebook.
In this notebook, you will make the first steps towards developing an algorithm that could be used as part of a mobile or web app. At the end of this project, your code will accept any user-supplied image as input. If a dog is detected in the image, it will provide an estimate of the dog's breed. If a human is detected, it will provide an estimate of the dog breed that the person most resembles. The image below displays potential sample output of your finished project (... but we expect that each student's algorithm will behave differently!).

In this real-world setting, you will need to piece together a series of models to perform different tasks; for instance, the algorithm that detects humans in an image will be different from the CNN that infers dog breed. There are many points of possible failure, and no perfect algorithm exists. Your imperfect solution will nonetheless create a fun user experience!
We break the notebook into separate steps. Feel free to use the links below to navigate the notebook.
Make sure that you've downloaded the required human and dog datasets:
Download the dog dataset. Unzip the folder and place it in this project's home directory, at the location /dog_images.
Download the human dataset. Unzip the folder and place it in the home directory, at location /lfw.
Note: If you are using a Windows machine, you are encouraged to use 7zip to extract the folder.
In the code cell below, we save the file paths for both the human (LFW) dataset and dog dataset in the numpy arrays human_files and dog_files.
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
from glob import glob
import os
import time
from tqdm import tqdm
#from time import time
import copy
import cv2
# load filenames for human and dog images
# human_files = np.array(glob("/data/lfw/*/*"))
# dog_files = np.array(glob("/data/dog_images/*/*/*"))
# have copied the images to my local workspace to allow for removing corrupted images,
# so set up directory variables to allow for loading from either original or workspace
orig_dir = "/data/"
work_dir = "/home/workspace/data/"
project_dir = "/home/workspace/dog_project/"
human_files = np.array(glob(orig_dir + 'lfw/*/*'))
dog_files = np.array(glob(orig_dir + 'dog_images/*/*/*'))
# print number of images in each dataset
print('There are %d total human images.' % len(human_files))
print('There are %d total dog images.' % len(dog_files))
# global variables used throughout...
model_name = ""
use_weights = False
import torch
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
# see https://github.com/pytorch/pytorch/issues/7068
import random
SEED = 0
torch.manual_seed(SEED)
torch.cuda.manual_seed(SEED)
np.random.seed(SEED)
random.seed(SEED)
torch.backends.cudnn.deterministic=True
def get_learning_rate(optimizer):
    lr = []
    for param_group in optimizer.param_groups:
        lr += [param_group['lr']]
    return lr
def unfreeze(model):
    for name, child in model.named_children():
        for param in child.parameters():
            param.requires_grad = True
        unfreeze(child)
def freeze(model):
    for name, child in model.named_children():
        for param in child.parameters():
            param.requires_grad = False
        freeze(child)
def compare_dicts(dict_1, dict_2):
    dicts_differ = 0
    for key_item_1, key_item_2 in zip(dict_1.items(), dict_2.items()):
        if torch.equal(key_item_1[1], key_item_2[1]):
            pass
        else:
            dicts_differ += 1
            if (key_item_1[0] == key_item_2[0]):
                print('Mismatch found at', key_item_1[0])
            else:
                raise Exception
    if dicts_differ == 0:
        print('State_Dicts match')
def reloadModel(model, optimizer, state_dicts_name='model_d_502.pth',
                learning_rate=1e-05, isV1=False, resetarray=True):
    global train_losses, valid_losses, val_acc_history, lr_hist
    global best_acc, best_acc_epoch, best_val_epoch, epoch_loss_min
    state_dicts = torch.load(state_dicts_name, map_location=lambda storage, loc: storage)
    model_statedict = state_dicts['model_statedict']
    optimizer_statedict = state_dicts['optimizer_statedict']
    if isV1:
        model_dict = model.state_dict()
        # 1. filter out unnecessary keys
        pretrained_dict = {k: v for k, v in model_statedict.items() if k in model_dict}
        # 2. overwrite entries in the existing state dict
        model_dict.update(pretrained_dict)
        # 3. load the new state dict (the merged dict, so no keys are missing)
        model.load_state_dict(model_dict)
    else:
        model.load_state_dict(model_statedict)
    parameters = filter(lambda p: p.requires_grad, model.parameters())
    optimizer = torch.optim.Adam(parameters, lr=learning_rate)
    optimizer.load_state_dict(optimizer_statedict)
    best_acc_epoch = state_dicts['best_acc_epoch']
    best_val_epoch = state_dicts['best_val_epoch']
    if resetarray:
        train_losses = state_dicts['train_losses'][:best_acc_epoch]
        valid_losses = state_dicts['valid_losses'][:best_acc_epoch]
        val_acc_history = state_dicts['val_acc_history'][:best_acc_epoch]
        #train_acc_history = state_dicts['train_acc_history'][:best_acc_epoch]
        best_acc = val_acc_history[-1]
        epoch_loss_min = valid_losses[-1]
    else:
        train_losses = state_dicts['train_losses']
        valid_losses = state_dicts['valid_losses']
        val_acc_history = state_dicts['val_acc_history']
        #train_acc_history = state_dicts['train_acc_history']
        best_acc = val_acc_history[best_acc_epoch]
        epoch_loss_min = valid_losses[best_val_epoch]
The following image-handling utility code is adapted from open-source code found on the internet or in Udacity-provided notebooks...
def imshow(image, ax=None, title=None, color="black", filename=None, normalize=True):
    """Imshow for Tensor."""
    global img_means, img_std
    if img_means is not None:
        imgmeans = img_means
    else:
        imgmeans = [0.485, 0.456, 0.406]
    if img_std is not None:
        imgstd = img_std
    else:
        imgstd = [0.229, 0.224, 0.225]
    if isinstance(image, torch.Tensor):
        image = image.cpu()
        image = copy.deepcopy(image.numpy())
    if ax is None:
        fig, ax = plt.subplots()
    # PyTorch tensors assume the color channel is the first dimension,
    # but matplotlib assumes it is the third dimension
    if isinstance(image, np.ndarray):
        image = image.transpose((1, 2, 0))
    else:
        image = image.numpy().transpose((1, 2, 0))
    if normalize:
        # Undo preprocessing
        mean = np.array(imgmeans)
        std = np.array(imgstd)
        image = std * image + mean
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    if title is not None:
        ax.set_title(title, color=color)
    if filename is not None:
        ax.set_xlabel(filename)
    ax.imshow(image)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.spines['left'].set_visible(False)
    ax.spines['bottom'].set_visible(False)
    ax.tick_params(axis='both', length=0)
    ax.set_xticklabels('')
    ax.set_yticklabels('')
    return ax
def img_resize(img, sz):
    ''' Resize image to the passed size based on the shorter of the image sides, preserving the aspect ratio
    '''
    w, h = img.size
    aspect_ratio = w / h if h < w else h / w
    width = int(sz) if w < h else int(round(sz * aspect_ratio, 0))
    height = int(sz) if w > h else int(round(sz * aspect_ratio, 0))
    return img.resize((width, height))
def img_crop(img, sz):
    ''' Return a cropped square region from the centre of an image
    '''
    w, h = img.size
    x = (w - sz) / 2
    y = (h - sz) / 2
    x1 = x + sz
    y1 = y + sz
    return img.crop((x, y, x1, y1))
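As a quick sanity check on the crop arithmetic, here is the box the centre-crop logic above builds for a hypothetical 320×240 image cropped to a 224×224 square (the numbers are illustrative only):

```python
# Worked example of the centre-crop box (hypothetical 320x240 image, sz=224)
w, h, sz = 320, 240, 224

x = (w - sz) / 2   # left edge: 48.0
y = (h - sz) / 2   # top edge: 8.0
x1 = x + sz        # right edge: 272.0
y1 = y + sz        # bottom edge: 232.0

box = (x, y, x1, y1)
print(box)  # (48.0, 8.0, 272.0, 232.0)
# The box is always sz-by-sz, centred in the image
print(x1 - x, y1 - y)  # 224.0 224.0
```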
def img_process(img_path, img_sz=224, max_sz=None):
    '''
    Scale, crop and normalize an image, returning it as a numpy array
    '''
    global img_means, img_std
    if img_means is not None:
        imgmeans = img_means
    else:
        imgmeans = [0.485, 0.456, 0.406]
    if img_std is not None:
        imgstd = img_std
    else:
        imgstd = [0.229, 0.224, 0.225]
    if max_sz is None:
        max_sz = img_sz + (img_sz // 7)
    # Open the image
    from PIL import Image
    img = Image.open(img_path)
    # Resize image so the shortest side is max_sz pixels (256 is standard for a 224 crop, 341 for 299)
    img = img_resize(img, max_sz)
    # Crop image
    img = img_crop(img, img_sz)
    # Normalize
    img = np.array(img) / 255
    mean = np.array(imgmeans)  # provided mean or ImageNet mean
    std = np.array(imgstd)  # provided std or ImageNet std
    img = (img - mean) / std
    # PyTorch requires color channels in the first dimension (opposite of PIL)
    img = img.transpose((2, 0, 1))
    return img
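A quick numpy-only sanity check (not part of the project code) confirms that the un-normalisation applied in imshow (std * image + mean) is the exact inverse of the normalisation applied in img_process ((img - mean) / std), using the ImageNet defaults:

```python
import numpy as np

# Round-trip check: normalise then un-normalise a fake channels-last image
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

rng = np.random.default_rng(0)
img = rng.random((224, 224, 3))        # fake image in [0, 1], channels last

normalized = (img - mean) / std        # forward step in img_process
recovered = std * normalized + mean    # inverse step in imshow

print(np.allclose(recovered, img))     # True
```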
We have downloaded a dictionary mapping the entire set of ImageNet class indices to names into the workspace, so we can match names to the class values...
os.path.isfile(project_dir + "imagenet1000_clsidx_to_labels.txt")
ImageNetDict = eval(open("imagenet1000_clsidx_to_labels.txt").read())
Set image mean and std to values (see calculations later in notebook) if ImageNet defaults are not wanted...
# Allows for specific calculation of image means and standard deviation,
# defaults are the imagenet values...
img_means = None
img_std = None
img_means = [0.4868, 0.4666, 0.3972]
img_std = [0.2605, 0.2551, 0.2609]
To allow for long-running processes (i.e. network training), import workspace_utils. The %load magic command is used to import it into the next cell.
os.path.isfile(project_dir + "workspace_utils.py")
# %load workspace_utils.py
import signal
from contextlib import contextmanager
import requests
DELAY = INTERVAL = 4 * 60 # interval time in seconds
MIN_DELAY = MIN_INTERVAL = 2 * 60
KEEPALIVE_URL = "https://nebula.udacity.com/api/v1/remote/keep-alive"
TOKEN_URL = "http://metadata.google.internal/computeMetadata/v1/instance/attributes/keep_alive_token"
TOKEN_HEADERS = {"Metadata-Flavor":"Google"}
def _request_handler(headers):
    def _handler(signum, frame):
        requests.request("POST", KEEPALIVE_URL, headers=headers)
    return _handler
@contextmanager
def active_session(delay=DELAY, interval=INTERVAL):
    """
    Example:
    from workspace_utils import active_session
    with active_session():
        # do long-running work here
    """
    token = requests.request("GET", TOKEN_URL, headers=TOKEN_HEADERS).text
    headers = {'Authorization': "STAR " + token}
    delay = max(delay, MIN_DELAY)
    interval = max(interval, MIN_INTERVAL)
    original_handler = signal.getsignal(signal.SIGALRM)
    try:
        signal.signal(signal.SIGALRM, _request_handler(headers))
        signal.setitimer(signal.ITIMER_REAL, delay, interval)
        yield
    finally:
        signal.signal(signal.SIGALRM, original_handler)
        signal.setitimer(signal.ITIMER_REAL, 0)
def keep_awake(iterable, delay=DELAY, interval=INTERVAL):
    """
    Example:
    from workspace_utils import keep_awake
    for i in keep_awake(range(5)):
        # do iteration with lots of work here
    """
    with active_session(delay, interval):
        yield from iterable
In this section, we use OpenCV's implementation of Haar feature-based cascade classifiers to detect human faces in images.
OpenCV provides many pre-trained face detectors, stored as XML files on github. We have downloaded one of these detectors and stored it in the haarcascades directory. In the next code cell, we demonstrate how to use this detector to find human faces in a sample image.
import cv2
import matplotlib.pyplot as plt
%matplotlib inline
# extract pre-trained face detector
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_alt.xml')
# load color (BGR) image
img = cv2.imread(human_files[0])
# convert BGR image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# find faces in image
faces = face_cascade.detectMultiScale(gray)
# print number of faces detected in the image
print('Number of faces detected:', len(faces))
# get bounding box for each detected face
for (x, y, w, h) in faces:
    # add bounding box to color image
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
# convert BGR image to RGB for plotting
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# display the image, along with bounding box
plt.imshow(cv_rgb)
plt.show()
Before using any of the face detectors, it is standard procedure to convert the images to grayscale. The detectMultiScale function executes the classifier stored in face_cascade and takes the grayscale image as a parameter.
In the above code, faces is a numpy array of detected faces, where each row corresponds to a detected face. Each detected face is a 1D array with four entries that specifies the bounding box of the detected face. The first two entries in the array (extracted in the above code as x and y) specify the horizontal and vertical positions of the top left corner of the bounding box. The last two entries in the array (extracted here as w and h) specify the width and height of the box.
We can use this procedure to write a function that returns True if a human face is detected in an image and False otherwise. This function, aptly named face_detector, takes a string-valued file path to an image as input and appears in the code block below.
# returns "True" if face is detected in image stored at img_path
def face_detector(img_path):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0
Question 1: Use the code cell below to test the performance of the face_detector function.
What percentage of the first 100 images in human_files have a detected human face? What percentage of the first 100 images in dog_files have a detected human face? Ideally, we would like 100% of human images with a detected face and 0% of dog images with a detected face. You will see that our algorithm falls short of this goal, but still gives acceptable performance. We extract the file paths for the first 100 images from each of the datasets and store them in the numpy arrays human_files_short and dog_files_short.
Answer: (You can print out your results and/or write your percentages in this cell)
from tqdm import tqdm
human_files_short = human_files[:100]
dog_files_short = dog_files[:100]
#-#-# Do NOT modify the code above this line. #-#-#
## TODO: Test the performance of the face_detector algorithm
## on the images in human_files_short and dog_files_short.
def countFaces(faceList):
    return [face_detector(img) for img in faceList]
human_faces = countFaces(human_files_short)
dog_faces = countFaces(dog_files_short)
print("{}% of human faces detected".format(sum(human_faces)))
print("{}% of dogs detected as having human faces!".format(sum(dog_faces)))
In the first 100 images from each set, 98% of the human faces were detected, while 17% of the dog images were detected as containing human faces.
bad_human_faces = [a for a, b in zip(human_files_short, human_faces) if b == False]
bad_dog_faces = [a for a, b in zip(dog_files_short, dog_faces) if b == True]
len(bad_human_faces), len(bad_dog_faces)
for ii in range(len(bad_human_faces)):
    print(bad_human_faces[ii])
_, axes = plt.subplots(figsize=(20,6), ncols=2)
for ii in range(2):
    ax = axes[ii]
    img = cv2.imread(bad_human_faces[ii])
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(cv_rgb)
for ii in range(len(bad_dog_faces)):
    print(bad_dog_faces[ii])
_, axes = plt.subplots(figsize=(20,6), ncols=5)
for ii in range(5):
    ax = axes[ii]
    img = cv2.imread(bad_dog_faces[ii])
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(cv_rgb)
testdogs = [orig_dir + 'dog_images/test/004.Akita/Akita_00258.jpg',
orig_dir + 'dog_images/test/005.Alaskan_malamute/Alaskan_malamute_00330.jpg',
orig_dir + 'dog_images/train/103.Mastiff/Mastiff_06834.jpg',
orig_dir + 'dog_images/test/004.Akita/Akita_00282.jpg']
_, axes = plt.subplots(figsize=(20,6), ncols=4)
for ii in range(4):
    ax = axes[ii]
    img = cv2.imread(testdogs[ii])
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(cv_rgb)
figsize = (20, 10)
fig = plt.figure(figsize=figsize)
img = cv2.imread(testdogs[2])
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(cv_rgb)
figsize = (20, 10)
fig = plt.figure(figsize=figsize)
img = cv2.imread(testdogs[3])
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(cv_rgb)
Examination of a few images suggests that human face detection can fail due to poor resolution, focus, contrast or colour in the human photos, or when the face is in profile or otherwise obscured. Some dog photos may genuinely contain human faces, and dogs facing the camera front-on with clearly discernible features can be interpreted as having a human face. Geometric shapes that include something like two eyes and a mouth can also be interpreted as a face, and worse still, some detections have no understandable explanation at all!
We suggest the face detector from OpenCV as a potential way to detect human images in your algorithm, but you are free to explore other approaches, especially approaches that make use of deep learning :). Please use the code cell below to design and test your own face detection algorithm. If you decide to pursue this optional task, report performance on human_files_short and dog_files_short.
### (Optional)
### TODO: Test performance of another face detection algorithm.
### Feel free to use as many code cells as needed.
data_dir = orig_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
from pathlib import Path
import os
def folders_in_path(path):
    if not Path.is_dir(path):
        raise ValueError("argument is not a directory")
    yield from filter(Path.is_dir, path.iterdir())
def folders_in_depth(path, depth):
    if 0 > depth:
        raise ValueError("depth smaller than 0")
    if 0 == depth:
        yield from folders_in_path(path)
    else:
        for folder in folders_in_path(path):
            yield from folders_in_depth(folder, depth-1)
def files_in_path(path):
    if not Path.is_dir(path):
        raise ValueError("argument is not a directory")
    yield from filter(Path.is_file, path.iterdir())
def files_per_folder(dir_path, dir_desc):
    files_per_folder = []
    for folder in folders_in_depth(Path.cwd()/dir_path, 0):
        files = list(files_in_path(folder))
        foldername = os.path.basename(os.path.normpath(folder))
        files_per_folder.append((foldername, len(files)))
    print(dir_desc)
    print("-" * 48)
    for image in files_per_folder:
        img_name = image[0] + ' '*50
        img_name = img_name[:45]
        img_count = image[1]
        print('{} {}'.format(img_name, img_count))
files_per_folder(train_dir, "Training data files per class:")
files_per_folder(valid_dir, "Validation data files per class:")
files_per_folder(test_dir, "Testing data files per class:")
files_per_folder(orig_dir + "lfw/", "Human image counts:")
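The same per-class counts can be produced with a compact pathlib comprehension. A minimal stdlib sketch, run against a throwaway directory tree (the folder names and counts here are made up for illustration):

```python
import tempfile
from pathlib import Path

# Build a tiny fake dataset tree: two class folders with a few files each
root = Path(tempfile.mkdtemp())
for breed, n in [("001.Affenpinscher", 3), ("002.Afghan_hound", 2)]:
    folder = root / breed
    folder.mkdir()
    for i in range(n):
        (folder / f"img_{i}.jpg").touch()

# Count files per immediate subfolder
counts = {d.name: sum(1 for f in d.iterdir() if f.is_file())
          for d in sorted(root.iterdir()) if d.is_dir()}
print(counts)  # {'001.Affenpinscher': 3, '002.Afghan_hound': 2}
```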
In this section, we use a pre-trained model to detect dogs in images.
The code cell below downloads the VGG-16 model, along with weights that have been trained on ImageNet, a very large, very popular dataset used for image classification and other vision tasks. ImageNet contains over 10 million URLs, each linking to an image containing an object from one of 1000 categories.
import torch
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
import torch
import torchvision.models as models
# define VGG16 model
VGG16 = models.vgg16(pretrained=True)
# check if CUDA is available
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# move model to GPU if CUDA is available
if use_cuda:
VGG16 = VGG16.cuda()
device
Ensure the model is in evaluation mode...
VGG16.eval();
print(VGG16.classifier)
Given an image, this pre-trained VGG-16 model returns a prediction (derived from the 1000 possible categories in ImageNet) for the object that is contained in the image.
ImageNetDict
VGG16.class_to_name = ImageNetDict
The ImageNet defaults are probably quite sufficient for normalisation, since these are ImageNet images. But as they are a subset, it is perhaps worth experimenting with calculating the means and standard deviations from just these images, and in any case it's a useful tool to have.
The first approach is one I developed (informed by Prof. Google); the second approach comes from the Udacity student hub. I am using the results from my own approach, as the standard deviation figures differ and mine seem more realistic to me. However, I will revise this if I find a reliable definition for calculating the std.
# Allows for specific calculation of image means and standard deviation,
# defaults are the imagenet values...
img_means = None
img_std = None
After running next cell go to "Skip to here to avoid normalization calculations" if wanting to avoid the next few cells...
img_means = [0.4868, 0.4666, 0.3972]
img_std = [0.2605, 0.2551, 0.2609]
Using ImageFile.LOAD_TRUNCATED_IMAGES = True and num_workers = 0 prevents an error when loading the corrupted image train/098.Leonberger/Leonberger_06571.jpg. However, this was very slow (6 minutes instead of 1), so ideally the image should be deleted...
if os.path.isfile(train_dir + '098.Leonberger/Leonberger_06571.jpg'):
    print('Removing', train_dir + '098.Leonberger/Leonberger_06571.jpg')
    os.remove(train_dir + '098.Leonberger/Leonberger_06571.jpg')
else:
    print(train_dir + '098.Leonberger/Leonberger_06571.jpg was not found')
In the Dog Project Workspace the bad image cannot be deleted. For now I am processing with LOAD_TRUNCATED_IMAGES = True...
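Rather than relying on LOAD_TRUNCATED_IMAGES, it can also be worth scanning the dataset for unreadable files up front. A hedged sketch using PIL's Image.verify() (the function name and the .jpg glob are my own choices, not project code):

```python
from pathlib import Path
from PIL import Image

def find_bad_images(root):
    """Return paths of .jpg files under root that PIL cannot open and verify.

    Note: verify() checks file integrity without a full decode, so it may
    miss some subtler forms of corruption (e.g. truncated pixel data).
    """
    bad = []
    for path in Path(root).rglob("*.jpg"):
        try:
            with Image.open(path) as img:
                img.verify()
        except Exception:
            bad.append(str(path))
    return bad
```

Running this over the train, valid and test folders before building the dataloaders would flag files like the Leonberger image above.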
from PIL import Image
from PIL import ImageFile
# Try using with num_workers = 0 to handle loading of corrupted files...
ImageFile.LOAD_TRUNCATED_IMAGES = True
import torch
import torchvision.transforms as transforms
from torchvision import datasets, transforms
import os
image_size = 224
# Setting num_workers to zero in the project workspace
# Anything higher gets RuntimeError: DataLoader worker (pid 83) is killed by signal: Bus error.
num_workers = 0
tfms_basic = transforms.Compose([
transforms.Resize((image_size, image_size)),
transforms.ToTensor()
])
dataset1 = datasets.ImageFolder(train_dir, transform=tfms_basic)
dataloader1 = torch.utils.data.DataLoader(dataset1, num_workers=num_workers, batch_size=128)
dataset2 = datasets.ImageFolder(valid_dir, transform=tfms_basic)
dataloader2 = torch.utils.data.DataLoader(dataset2, num_workers=num_workers, batch_size=128)
Method 1: calculate the actual per-channel means and standard deviations of the dog images
%%time
# Get the mean and standard deviation of all images in train and valid datasets
red_chan = []
gre_chan = []
blu_chan = []
for images, _ in dataloader1:
    for image in images:
        red_chan.append(image[0])
        gre_chan.append(image[1])
        blu_chan.append(image[2])
for images, _ in dataloader2:
    for image in images:
        red_chan.append(image[0])
        gre_chan.append(image[1])
        blu_chan.append(image[2])
red_channels = torch.cat(red_chan, dim=0)
green_channels = torch.cat(gre_chan, dim=0)
blue_channels = torch.cat(blu_chan, dim=0)
img_means = round(red_channels.mean().item(), 4), round(green_channels.mean().item(), 4), round(blue_channels.mean().item(), 4)
img_means = list(img_means)
img_means
img_std = round(red_channels.std().item(), 4), round(green_channels.std().item(), 4), round(blue_channels.std().item(), 4)
img_std = list(img_std)
img_std
Results of calculating means and std:
img_means = [0.4868, 0.4666, 0.3972]
img_std = [0.2605, 0.2551, 0.2609]
Compare with ImageNet defaults:
imgmeans = [0.485, 0.456, 0.406]
imgstd = [0.229, 0.224, 0.225]
Method 2, from the Udacity Student Hub, gives a very different result for the std: it takes the std of the accumulated per-pixel sums and then divides by the dataset size, which is not the same as the std of the pixel values themselves...
accumulated = torch.from_numpy(np.zeros((3, image_size * image_size))).float()
%%time
for data, *_ in dataset1:
    modified = data.view(3, -1)
    accumulated.add_(modified)
for data, *_ in dataset2:
    modified = data.view(3, -1)
    accumulated.add_(modified)
means = accumulated.mean(dim=1) / (len(dataset1) + len(dataset2))
stds = accumulated.std(dim=1) / (len(dataset1) + len(dataset2))
Method 2 results:
means = [0.4861, 0.4560, 0.3918]
stds = [0.0070, 0.0189, 0.0104]
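For reference, a streaming per-channel mean and std can be computed correctly in one pass by accumulating sums and sums of squares, then using std = sqrt(E[x²] − E[x]²). A numpy sketch on fake batches (with real data, the batches would come from the dataloaders above; note this gives the population std, whereas torch's .std() defaults to the sample std):

```python
import numpy as np

# Fake (N, C, H, W) batches standing in for dataloader output
rng = np.random.default_rng(0)
batches = [rng.random((16, 3, 8, 8)) for _ in range(4)]

total = np.zeros(3)
total_sq = np.zeros(3)
count = 0
for batch in batches:
    # Flatten to channel-major: shape (3, N*H*W)
    pixels = batch.transpose(1, 0, 2, 3).reshape(3, -1)
    total += pixels.sum(axis=1)
    total_sq += (pixels ** 2).sum(axis=1)
    count += pixels.shape[1]

mean = total / count
std = np.sqrt(total_sq / count - mean ** 2)

# Cross-check against computing directly over all pixels at once
all_pixels = np.concatenate(
    [b.transpose(1, 0, 2, 3).reshape(3, -1) for b in batches], axis=1)
print(np.allclose(mean, all_pixels.mean(axis=1)))  # True
print(np.allclose(std, all_pixels.std(axis=1)))    # True
```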
In the next code cell, you will write a function that accepts a path to an image (such as 'dogImages/train/001.Affenpinscher/Affenpinscher_00001.jpg') as input and returns the index corresponding to the ImageNet class that is predicted by the pre-trained VGG-16 model. The output should always be an integer between 0 and 999, inclusive.
Before writing the function, make sure that you take the time to learn how to appropriately pre-process tensors for pre-trained models in the PyTorch documentation.
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
import torchvision.transforms as transforms
import torch.nn.functional as F
def VGG16_predict(img_path, topk=1):
    '''
    Use pre-trained VGG-16 model to obtain the index corresponding to the
    predicted ImageNet class for the image at the specified path
    Args:
        img_path: path to an image
        topk: number of predictions to return (allows for the top 5, for instance)
    Returns:
        Index(es) corresponding to the VGG-16 model's prediction(s)
        Class name(s) for the prediction(s)
        Probability of the prediction(s)
    '''
    ## TODO: Complete the function.
    ## Load and pre-process an image from the given img_path
    ## Return the *index* of the predicted class for that image
    img = img_process(img_path)
    img_tensor = torch.from_numpy(img).type(torch.FloatTensor)
    img_tensor.unsqueeze_(0)
    VGG16.cpu()
    VGG16.eval()
    with torch.no_grad():
        # softmax converts the logits to probabilities
        ps = F.softmax(VGG16.forward(img_tensor), dim=1)
    probs, classes = torch.topk(ps, k=topk)
    probs = probs.view(topk).detach().numpy().tolist()
    classes = classes.view(topk).detach().numpy().tolist()
    classnames = [VGG16.class_to_name[cls] for cls in classes]
    return classes, classnames, probs
data_dir = orig_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg'), title='Affenpinscher')
VGG16_predict(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg')
From the dictionary: 252: 'affenpinscher, monkey pinscher, monkey dog',
While looking at the dictionary, you will notice that the categories corresponding to dogs appear in an uninterrupted sequence and correspond to dictionary keys 151-268, inclusive, to include all categories from 'Chihuahua' to 'Mexican hairless'. Thus, in order to check to see if an image is predicted to contain a dog by the pre-trained VGG-16 model, we need only check if the pre-trained model predicts an index between 151 and 268 (inclusive).
Use these ideas to complete the dog_detector function below, which returns True if a dog is detected in an image (and False if not).
### returns "True" if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    ## TODO: Complete the function.
    prediction = VGG16_predict(img_path)
    dog_detected = (151 <= prediction[0][0] <= 268)
    return dog_detected
dog_detector(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg')
Question 2: Use the code cell below to test the performance of your dog_detector function.
What percentage of the images in human_files_short have a detected dog? What percentage of the images in dog_files_short have a detected dog?
Answer:
%%time
### TODO: Test the performance of the dog_detector function
### on the images in human_files_short and dog_files_short.
def countDogs(ImageList):
    return [dog_detector(img) for img in ImageList]
human_dogs = countDogs(human_files_short)
dog_dogs = countDogs(dog_files_short)
print("{}% of dogs detected in human data".format(sum(human_dogs)))
print("{}% of dogs detected in dogs data".format(sum(dog_dogs)))
bad_human_dogs = [a for a, b in zip(human_files_short, human_dogs) if b == True]
bad_dog_dogs = [a for a, b in zip(dog_files_short, dog_dogs) if b == False]
for ii in range(len(bad_human_dogs)):
    print(bad_human_dogs[ii])
VGG16_predict(bad_human_dogs[0])
_, axes = plt.subplots(figsize=(20,6), ncols=2)
for ii in range(2):
    ax = axes[ii]
    if ii == 0:
        img = cv2.imread(bad_human_dogs[0])
    else:
        img = cv2.imread(test_dir+'033.Bouvier_des_flandres/Bouvier_des_flandres_02305.jpg')
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    ax.imshow(cv_rgb)
for ii in range(len(bad_dog_dogs)):
    print(bad_dog_dogs[ii])
plt.subplots(figsize=(20,6), ncols=1)
img = cv2.imread(bad_dog_dogs[0])
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(cv_rgb)
We suggest VGG-16 as a potential network to detect dog images in your algorithm, but you are free to explore other pre-trained networks (such as Inception-v3, ResNet-50, etc). Please use the code cell below to test other pre-trained PyTorch models. If you decide to pursue this optional task, report performance on human_files_short and dog_files_short.
### (Optional)
### TODO: Report the performance of another pre-trained network.
### Feel free to use as many code cells as needed.
data_dir = work_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
if os.path.isfile(train_dir + '098.Leonberger/Leonberger_06571.jpg'):
    print('Removing', train_dir + '098.Leonberger/Leonberger_06571.jpg')
    os.remove(train_dir + '098.Leonberger/Leonberger_06571.jpg')
else:
    print(train_dir + '098.Leonberger/Leonberger_06571.jpg was not found')
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
data_dir = orig_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
Now that we have functions for detecting humans and dogs in images, we need a way to predict breed from images. In this step, you will create a CNN that classifies dog breeds. You must create your CNN from scratch (so, you can't use transfer learning yet!), and you must attain a test accuracy of at least 10%. In Step 4 of this notebook, you will have the opportunity to use transfer learning to create a CNN that attains greatly improved accuracy.
We mention that the task of assigning breed to dogs from images is considered exceptionally challenging. To see why, consider that even a human would have trouble distinguishing between a Brittany and a Welsh Springer Spaniel.
| Brittany | Welsh Springer Spaniel |
|---|---|
![]() |
![]() |
It is not difficult to find other dog breed pairs with minimal inter-class variation (for instance, Curly-Coated Retrievers and American Water Spaniels).
| Curly-Coated Retriever | American Water Spaniel |
|---|---|
![]() |
![]() |
Likewise, recall that labradors come in yellow, chocolate, and black. Your vision-based algorithm will have to conquer this high intra-class variation to determine how to classify all of these different shades as the same breed.
| Yellow Labrador | Chocolate Labrador | Black Labrador |
|---|---|---|
| ![]() | ![]() | ![]() |
We also mention that random chance presents an exceptionally low bar: setting aside the fact that the classes are slightly imbalanced, a random guess will provide a correct answer roughly 1 in 133 times, which corresponds to an accuracy of less than 1%.
Remember that the practice is far ahead of the theory in deep learning. Experiment with many different architectures, and trust your intuition. And, of course, have fun!
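That chance baseline can be verified with a line of arithmetic. This is a minimal sketch assuming perfectly balanced classes; the true figure shifts slightly with the real class imbalance:

```python
# Random-guess baseline for a 133-class problem (assumes balanced classes).
num_classes = 133
chance_accuracy = 1.0 / num_classes
print(f"Chance accuracy: {chance_accuracy:.4%}")  # roughly 0.75%, i.e. under 1%
```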
data_dir = work_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
data_dir = orig_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
Use the code cell below to write three separate data loaders for the training, validation, and test datasets of dog images (located at dogImages/train, dogImages/valid, and dogImages/test, respectively). You may find this documentation on custom datasets to be a useful resource. If you are interested in augmenting your training and/or validation data, check out the wide variety of transforms!
import os
import torch
from torchvision import datasets, transforms
from PIL import Image
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True
### TODO: Write data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes
batch_size = 32
num_workers = 0 # Using 0 in Dog Project Workspace, otherwise 4
image_size = 224 # 180
# Fall back to ImageNet statistics unless custom values were defined earlier
if 'img_means' in globals() and img_means is not None:
    imgmeans = img_means
else:
    imgmeans = [0.485, 0.456, 0.406]
if 'img_std' in globals() and img_std is not None:
    imgstd = img_std
else:
    imgstd = [0.229, 0.224, 0.225]
data_transforms = {
'train': transforms.Compose([transforms.RandomAffine(15, translate=(0.1, 0.1), scale=(1.0, 1.5),
shear=None, resample=Image.BILINEAR,
fillcolor=0),
transforms.Resize(image_size + (image_size//7),
interpolation=Image.BILINEAR),
transforms.CenterCrop(image_size),
transforms.RandomHorizontalFlip(),
transforms.ColorJitter(brightness=0.3, contrast=0.3,
saturation=0.2, hue=0.05),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
]),
'valid': transforms.Compose([transforms.Resize(image_size + (image_size//7)),
transforms.CenterCrop(image_size),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
]),
'test' : transforms.Compose([transforms.Resize(image_size + (image_size//7)),
transforms.CenterCrop(image_size),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
])
}
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
data_transforms[x])
for x in ['train', 'valid', 'test']}
class_names = image_datasets['train'].classes
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid', 'test']}
loaders_scratch = {x: torch.utils.data.DataLoader(image_datasets[x],
                                                  batch_size=batch_size,
                                                  shuffle=(x != 'test'),
                                                  num_workers=num_workers)
                   for x in ['train', 'valid', 'test']}
class_names
class_len = len(class_names)
class_len
dataset_sizes
Create a tensor of class weights in case I want to cope with class imbalances...
from collections import defaultdict
class_counts = defaultdict(int)
for _, c in image_datasets["train"].imgs:
    class_counts[c] += 1
class_weights = [1 - (float(class_counts[class_id]) / len(image_datasets["train"].imgs))
                 for class_id in range(len(image_datasets["train"].classes))]
class_weights = torch.FloatTensor(class_weights)
class_weights = class_weights.to(device)  # .to() is not in-place, so reassign
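As a quick sanity check of the weighting scheme above (weight = 1 - count/total), here is the same formula applied to a hypothetical toy class distribution, showing that rarer classes receive larger weights:

```python
# Hypothetical toy class counts: class 0 is common, class 2 is rare.
class_counts = {0: 60, 1: 30, 2: 10}
total = sum(class_counts.values())

# Same formula as above: weight = 1 - count/total
class_weights = [1 - class_counts[c] / total for c in sorted(class_counts)]
print(class_weights)  # approximately [0.4, 0.7, 0.9]: the rare class gets the largest weight
```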
Question 3: Describe your chosen procedure for preprocessing the data.
Answer:
The dataloaders process images via PyTorch datasets, which apply PyTorch transforms to the images.
Each transform pipeline first resizes images to a size just large enough that a subsequent center crop reduces the image to the desired dimensions.
A custom normalisation can be supplied via the variables img_means and img_std, but ImageNet normalization is used by default when these variables are None.
Additionally, the training transform employs image augmentation: RandomAffine for rotating and stretching the image, and ColorJitter to introduce colour and brightness variations. This increases the variation of the training images, which helps prevent overfitting.
An initial image size of 180x180 was chosen, as a smaller size allows for more reasonable batch sizes and faster processing, which is important when testing a network being trained from scratch. This was subsequently adjusted up to 224x224 once a useful architecture was found, to conform with the expected ImageNet size.
A tensor of class weights was created in case it is needed to offset class imbalances.
I am using num_workers of 0 in the Dog Project workspace, as I get a memory-allocation error with a higher setting. On my personal system I use 4, which loads the data three to four times faster (the GPU is also faster on the PC)...
I have implemented an adaptive pooling approach for the classifier, which allows me to simply specify the desired output of the classifier without having to calculate and hard-wire the incoming tensor size. See Jeremy Howard's nn tutorial for a mention of this, where he says "replace nn.AvgPool2d with nn.AdaptiveAvgPool2d, which allows us to define the size of the output tensor we want, rather than the input tensor we have. As a result, our model will work with any size input".
The advantage of this approach is not only that any image size can be freely used, but that image sizes can be altered during training, which has been noted to wake up the learning process and allow another step of model improvement, reducing overfitting. I used this to good effect in my model for the PyTorch Challenge.
I have created a version of the approach taken by fast.ai, where a class AdaptiveConcatPool2d is defined that combines an AdaptiveAvgPool2d with an AdaptiveMaxPool2d. This has the advantage of allowing the model to learn from maximum values as well as average ones. See more detailed discussions in the following links.
What goes on behind in the fastai library?
What is the distinct usage of the AdaptiveConcatPool2d layer?
# Adapted from fastai...
import torch
import torch.nn as nn

class AdaptiveConcatPool2d(nn.Module):
    """Concatenate adaptive average pooling and adaptive max pooling,
    doubling the channel count."""
    def __init__(self, sz=None):
        super().__init__()
        sz = sz or (1, 1)
        self.ap = nn.AdaptiveAvgPool2d(sz)
        self.mp = nn.AdaptiveMaxPool2d(sz)

    def forward(self, x):
        return torch.cat([self.mp(x), self.ap(x)], 1)

class Flatten(nn.Module):
    """Flatten all dimensions except the batch dimension."""
    def forward(self, x):
        return x.view(x.size(0), -1)
Create a CNN to classify dog breed. Use the template in the code cell below.
import torch.nn as nn
import torch.nn.functional as F
# define the CNN architecture
class Net(nn.Module):
    ### TODO: choose an architecture, and complete the class
    def __init__(self, num_classes=133):
        super(Net, self).__init__()
        ## Define layers of a CNN
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 48, kernel_size=5, stride=1, padding=2, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(48, 96, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=0)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(96, 128, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=1, padding=0)
        )
        self.classifier = nn.Sequential(
            AdaptiveConcatPool2d(),
            Flatten(),
            nn.BatchNorm1d(128*2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
            nn.Dropout(p=0.25),
            nn.Linear(in_features=128*2, out_features=512, bias=True),
            nn.ReLU(),
            nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
            nn.Dropout(p=0.25),
            nn.Linear(in_features=512, out_features=num_classes, bias=True)
        )

    def forward(self, x):
        ## Define forward behavior
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.classifier(x)
        return x
#-#-# You do NOT have to modify the code below this line. #-#-#
# instantiate the CNN
model_scratch = Net()
# move tensors to GPU if CUDA is available
if use_cuda:
    model_scratch.cuda()
print(model_scratch)
!pip install torchsummary
from torchsummary import summary
summary(model_scratch, (3, 224, 224))
Question 4: Outline the steps you took to get to your final CNN architecture and your reasoning at each step.
Answer:
This was very much a trial and error process, and even then I know that there must be many better ideas than the ones I settled on. I am not yet familiar enough with the domain to know exactly what I should do...
Firstly I set up a fairly deep network based on the initial layers of ResNet-18, but I found it was not progressing at all well, so I went back to basics and tried the following configurations, on the basis that increasing CNN complexity and regularization should be beneficial up to a point, after which training the network requires more resources and knowledge than I can apply, and there is no further benefit.
Configurations assessed:
Summary of Results:
The best performing network for accuracy was the 48, 96, 128 node network, which I describe below. Some key points:
Kernel Size:
The kernel size of the first layer was tested at 3, 5, and 7. A single kernel size of 3 did not perform as well as a single kernel of 5 in these configurations, but 7 was very difficult to train, needing more resources than I had to make progress with it. Two stacked size-3 kernels also demanded more resources than I had, though this may have been more to do with other aspects of that model's architecture.
Discussions 1, 2 regarding kernel size indicate that two stacked 3x3 kernels are a standard and a better choice than a single 5x5 kernel, so given time and more understanding I would like to revisit this and resolve the issues I had with that architecture.
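The trade-off mentioned above can be made concrete with a little arithmetic (bias terms ignored; the channel count here is a hypothetical value for illustration): two stacked 3x3 convolutions cover the same 5x5 receptive field with fewer weights, and add an extra non-linearity between them.

```python
def conv_params(k, c_in, c_out):
    """Weight count of a k x k convolution layer (bias ignored)."""
    return k * k * c_in * c_out

c = 48  # hypothetical channel count, kept constant across layers
single_5x5 = conv_params(5, c, c)        # one 5x5 layer
stacked_3x3 = 2 * conv_params(3, c, c)   # two 3x3 layers, same 5x5 receptive field
print(single_5x5, stacked_3x3)           # 57600 vs 41472: stacked 3x3 is cheaper
```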
Architecture of 48, 96, 128 CNN:
Net(
(layer1): Sequential(
(0): Conv2d(3, 48, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
(1): ReLU()
(2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(layer2): Sequential(
(0): Conv2d(48, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU()
(2): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
)
(layer3): Sequential(
(0): Conv2d(96, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU()
(2): MaxPool2d(kernel_size=2, stride=1, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=(1, 1))
(mp): AdaptiveMaxPool2d(output_size=(1, 1))
)
(1): Flatten()
(2): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25)
(4): Linear(in_features=256, out_features=512, bias=True)
(5): ReLU()
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.25)
(8): Linear(in_features=512, out_features=133, bias=True)
)
)
Explanation of architecture:
Set use_weights to True to use the weights to compensate for class imbalance
use_weights = False
learning_rate = 1e-5
Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_scratch, and the optimizer as optimizer_scratch below.
import torch.optim as optim
### TODO: select loss function
if use_weights:
    criterion_scratch = nn.CrossEntropyLoss(weight=class_weights.to(device), reduction='sum')
else:
    criterion_scratch = nn.CrossEntropyLoss()
### TODO: select optimizer
optimizer_scratch = torch.optim.Adam(model_scratch.parameters(), lr=learning_rate)
Code adapted from Sylvain Gugger's How Do You Find A Good Learning Rate
Running the learning rate finder leaves the model with altered weights, so after finding the learning rate I re-create the model to start with a clean slate.
import math

def find_lr(model, optimizer, criterion, dataloaders, init_value=1e-8,
            final_value=10., beta=0.98):
    num = len(dataloaders['train']) - 1
    mult = (final_value / init_value) ** (1/num)
    lr = init_value
    optimizer.param_groups[0]['lr'] = lr
    avg_loss = 0.
    best_loss = 0.
    batch_num = 0
    losses = []
    log_lrs = []
    for images, labels in dataloaders['train']:
        batch_num += 1
        # Get the loss for this mini-batch of images/outputs
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        # Compute the smoothed loss (bias-corrected exponential moving average)
        avg_loss = beta * avg_loss + (1-beta) * loss.item()
        smoothed_loss = avg_loss / (1 - beta**batch_num)
        # Stop if the loss is exploding
        if batch_num > 1 and smoothed_loss > 4 * best_loss:
            return log_lrs, losses
        # Record the best loss
        if smoothed_loss < best_loss or batch_num == 1:
            best_loss = smoothed_loss
        # Store the values
        losses.append(smoothed_loss)
        log_lrs.append(math.log10(lr))
        # Do the SGD step
        loss.backward()
        optimizer.step()
        # Update the lr for the next step
        lr *= mult
        optimizer.param_groups[0]['lr'] = lr
    return log_lrs, losses
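The smoothed-loss bookkeeping in find_lr is a bias-corrected exponential moving average. A quick stand-alone check with made-up loss values shows that for a constant loss the smoothed value reproduces the loss exactly, which is what dividing by 1 - beta**t buys compared with a raw moving average that starts near zero:

```python
# Bias-corrected EMA, as used for smoothed_loss in find_lr.
beta = 0.98
avg_loss = 0.0
for t, loss in enumerate([2.5, 2.5, 2.5], start=1):  # hypothetical constant losses
    avg_loss = beta * avg_loss + (1 - beta) * loss
    smoothed = avg_loss / (1 - beta ** t)
    print(round(smoothed, 6))  # 2.5 at every step: no start-up bias
```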
with active_session():
    logs, losses = find_lr(model_scratch, optimizer_scratch, criterion_scratch, loaders_scratch)
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(logs[10:-5],losses[10:-5])
The learning rate finder plot is a little hard to interpret, but 1e-5 is probably a good choice...
print("{:.10f}".format(1e-05))
print("{:.10f}".format(1e-06))
print("{:.10f}".format(1e-07))
learning_rate = 1e-05
optimizer_scratch = torch.optim.Adam(model_scratch.parameters(), lr=learning_rate)
get_learning_rate(optimizer_scratch)
Some of the code used below was adapted from https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
train_losses, valid_losses, train_acc_history, val_acc_history, lr_hist = [], [], [], [], []
best_acc, best_val_epoch, best_acc_epoch, epoch_loss_min = 0.0, 0, 0, np.Inf
Train and validate your model in the code cell below. Save the final model parameters at filepath 'model_scratch.pt'.
def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path,
          use_weights=False):
    """Returns trained model"""
    global train_losses, valid_losses, val_acc_history, lr_hist
    global best_acc, best_val_epoch, best_acc_epoch, epoch_loss_min
    curr_lr = get_learning_rate(optimizer)
    best_model_wts = copy.deepcopy(model.state_dict())
    best_optim_wts = copy.deepcopy(optimizer.state_dict())
    since = time.time()
    # initialize tracker for minimum validation loss
    valid_loss_min = np.Inf
    # also track maximum accuracy...
    valid_acc_max = 0
    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        valid_acc = 0.0
        ###################
        # train the model #
        ###################
        model.train()
        running_loss = 0.0
        running_corrects = 0
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## find the loss and update the model parameters accordingly
            optimizer.zero_grad()
            outputs = model(data)
            loss = criterion(outputs, target)
            _, preds = torch.max(outputs, 1)
            loss.backward()
            optimizer.step()
            running_loss += loss.item() * (1 if use_weights else data.size(0))
            running_corrects += torch.sum(preds == target.data)
            ## Suggested approach from the project workbook...
            ## record the average training loss, using something like
            train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data - train_loss))
        epoch_loss = running_loss / dataset_sizes['train']
        epoch_acc = running_corrects.double() / dataset_sizes['train']
        train_losses.append(epoch_loss)
        train_acc_history.append(epoch_acc)
        curr_lr = get_learning_rate(optimizer)
        lr_hist.append(curr_lr)
        ######################
        # validate the model #
        ######################
        model.eval()
        running_loss = 0.0
        running_corrects = 0
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()
            ## update the average validation loss
            val_outputs = model(data)
            val_loss = criterion(val_outputs, target)
            _, preds = torch.max(val_outputs, 1)
            running_loss += val_loss.item() * (1 if use_weights else data.size(0))
            running_corrects += torch.sum(preds == target.data)
            valid_loss = valid_loss + ((1 / (batch_idx + 1)) * (val_loss.data - valid_loss))
        epoch_val_loss = running_loss / dataset_sizes['valid']
        epoch_val_acc = running_corrects.double() / dataset_sizes['valid']
        valid_losses.append(epoch_val_loss)
        val_acc_history.append(epoch_val_acc.item())
        # print training/validation statistics
        # Tested and found that these calculations render identical results, so keeping my approach...
        # print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
        #     epoch,
        #     train_loss,
        #     valid_loss
        # ))
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
            epoch,
            epoch_loss,
            epoch_val_loss
        ))
        print('Epoch: {} \tTraining Accuracy: {:.6f} \tValidation Accuracy: {:.6f}'.format(
            epoch,
            epoch_acc,
            epoch_val_acc
        ))
        ## TODO: save the model if validation loss has decreased
        if epoch_val_acc > best_acc:
            filename = save_path + '_acc'  # + "_" + str(epoch)
            print('Accuracy has increased ({:.6f} --> {:.6f}) Saving model as {}...'.format(
                best_acc, epoch_val_acc, filename))
            best_acc_epoch = epoch
            best_acc = epoch_val_acc
            best_model_wts = copy.deepcopy(model.state_dict())
            best_optim_wts = copy.deepcopy(optimizer.state_dict())
            torch.save(best_model_wts, filename + '.pt')
            torch.save(best_optim_wts, filename + '_optimizer.pt')
        if epoch_val_loss <= epoch_loss_min:
            filename = save_path + '_val'  # + "_" + str(epoch)
            print('Validation loss decreased ({:.6f} --> {:.6f}) Saving model as {}...'.format(
                epoch_loss_min, epoch_val_loss, filename))
            best_val_epoch = epoch
            epoch_loss_min = epoch_val_loss
            torch.save(model.state_dict(), filename + '.pt')
            torch.save(optimizer.state_dict(), filename + '_optimizer.pt')
        print()
    time_elapsed = time.time() - since
    print()
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best validation accuracy: {:4f}'.format(best_acc))
    print('Best accuracy epoch : {}'.format(best_acc_epoch))
    print('Best validation loss : {:4f}'.format(epoch_loss_min))
    print('Best validation epoch : {}'.format(best_val_epoch))
    # load best accuracy model weights
    model.load_state_dict(best_model_wts)
    optimizer.load_state_dict(best_optim_wts)
    # Not automatically overwriting model_scratch state_dict as this may not be the best model...
    # # Save to required model_scratch
    # torch.save(model.state_dict(), save_path + '.pt')
    # torch.save(optimizer.state_dict(), save_path + '_optimizer.pt')
    # return trained model
    return model
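The running-average update used for train_loss and valid_loss above is the standard incremental-mean identity. A small stand-alone check with made-up batch losses confirms it matches the plain mean, which is why the commented-out printout rendered identical results:

```python
batch_losses = [3.2, 2.9, 3.1, 2.7]  # hypothetical per-batch losses

# Incremental update, as in the training loop above
running = 0.0
for i, loss in enumerate(batch_losses):
    running = running + (1 / (i + 1)) * (loss - running)

# Plain mean for comparison
plain_mean = sum(batch_losses) / len(batch_losses)
print(abs(running - plain_mean) < 1e-9)  # True: the two computations agree
```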
train_losses, valid_losses, train_acc_history, val_acc_history, lr_hist = [], [], [], [], []
best_acc, best_val_epoch, best_acc_epoch, epoch_loss_min = 0.0, 0, 0, np.Inf
To train the model and save the best state dicts, use the following cell; to reload a previously saved model_scratch.pt, use the separate cell below.
with active_session():
    # train the model
    model_scratch = train(300, loaders_scratch, model_scratch, optimizer_scratch,
                          criterion_scratch, use_cuda, 'model_scratch')
# load the model that got the best validation accuracy
#model_scratch.load_state_dict(torch.load('model_scratch.pt'))
# Save to required model_scratch
torch.save(model_scratch.state_dict(), 'model_scratch.pt')
torch.save(optimizer_scratch.state_dict(), 'model_scratch_optimizer.pt')
model_scratch.load_state_dict(torch.load('model_scratch.pt'))
optimizer_scratch.load_state_dict(torch.load('model_scratch_optimizer.pt'))
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(train_losses[2:], label="Training loss")
plt.plot(valid_losses[2:], label="Validation loss")
plt.legend(frameon=False)
plt.show()
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(val_acc_history[2:], label="Validation accuracy")
plt.legend(frameon=False)
plt.show()
Using reloadModel to restore the model and optimizer from the best accuracy state_dicts, and also restore tracking values...
NOTE: When loading a state dict saved under PyTorch 1.0 in a PyTorch 0.4.0 environment, an error occurs regarding unexpected keys:
RuntimeError: Error(s) in loading state_dict for Net:
Unexpected key(s) in state_dict: "layer1.4.num_batches_tracked", "layer2.4.num_batches_tracked", "layer3.4.num_batches_tracked", "classifier.2.num_batches_tracked", "classifier.6.num_batches_tracked".
See Loading part of a pre-trained model: when reloading a state dict prepared in v1.0, load it in stages by executing the code in the cell below instead of just model.load_state_dict()...
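The staged load in the next cell is plain dictionary filtering and merging. Here is the pattern in isolation, with toy dicts standing in for the real state dicts (all keys here are made up for illustration):

```python
model_dict = {'w1': 1, 'w2': 2, 'num_batches_tracked': 0}   # current model's state
saved_dict = {'w1': 10, 'w2': 20, 'extra_key': 99}          # hypothetical saved state

# 1. keep only keys the current model knows about
pretrained = {k: v for k, v in saved_dict.items() if k in model_dict}
# 2. overwrite matching entries, keeping model-only keys intact
model_dict.update(pretrained)
print(model_dict)  # {'w1': 10, 'w2': 20, 'num_batches_tracked': 0}
```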
model_dict = model_scratch.state_dict()
# 1. filter out unnecessary keys
pretrained_dict = {k: v for k, v in best_model_wts.items() if k in model_dict}
# 2. overwrite entries in the existing state dict
model_dict.update(pretrained_dict)
# 3. load the updated state dict (model_dict, not the partial pretrained_dict,
#    which would raise a missing-keys error under a strict load)
model_scratch.load_state_dict(model_dict)
model_scratch.eval();
!dir
state_dicts_name = 'model_d_502.pth'
learning_rate=1e-05
reloadModel(model_scratch, optimizer_scratch, state_dicts_name, learning_rate, isV1=True)
...or, just reloading models and optimizer from either best accuracy or best validation state_dicts
Accuracy...
best_model_wts = torch.load("model_scratch_acc.pt",
map_location=lambda storage, loc: storage)
best_optim_wts = torch.load("model_scratch_acc_optimizer.pt",
map_location=lambda storage, loc: storage)
or Loss...
best_model_wts = torch.load("model_scratch_val.pt",
map_location=lambda storage, loc: storage)
best_optim_wts = torch.load("model_scratch_val_optimizer.pt",
map_location=lambda storage, loc: storage)
Load state dicts into optimizer and model...
optimizer_scratch.load_state_dict(best_optim_wts)
model_scratch.load_state_dict(best_model_wts)
model_scratch.eval();
or, loading a v1.0 state dict...
model_dict = model_scratch.state_dict()
pretrained_dict = {k: v for k, v in best_model_wts.items() if k in model_dict}
model_dict.update(pretrained_dict)
model_scratch.load_state_dict(model_dict)
model_scratch.eval();
best_model_wts = copy.deepcopy(model_scratch.state_dict())
best_optim_wts = copy.deepcopy(optimizer_scratch.state_dict())
torch.save(best_model_wts, 'model_scratch.pt')
torch.save(best_optim_wts, 'model_scratch_optimizer.pt')
checkpt = 'model_d_502.pth'
torch.save({'model_statedict':model_scratch.state_dict(),
'optimizer_statedict':optimizer_scratch.state_dict(),
'best_acc_epoch' : 402 + 88,
'best_val_epoch' : 402 + 99,
'train_losses' : train_losses,
'valid_losses' : valid_losses,
'val_acc_history' : val_acc_history,
'train_acc_history' : train_acc_history},
checkpt)
Try out your model on the test dataset of dog images. Use the code cell below to calculate and print the test loss and accuracy. Ensure that your test accuracy is greater than 10%.
def test(loaders, model, criterion, use_cuda):
    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.
    model.eval()
    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)
    print('Test Loss: {:.6f}\n'.format(test_loss))
    print('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))
# call test function
%time test(loaders_scratch, model_scratch, criterion_scratch, use_cuda)
15% accuracy at 103 epochs, 29% (243/836) after 300 epochs, and 37% (316/836) after 490 epochs.
The model was run to 285 epochs with 224-pixel images, then switched to 299 pixels until epoch 400, then back to 224 until epoch 490...
Test Loss: 3.024790
Test Accuracy: 27% (228/836)
Training complete in 141m 46s
Best validation accuracy: 0.146108
Best accuracy epoch : 99
Best validation loss : 3.840200
Best validation epoch : 100
Training complete in 136m 32s
Best validation accuracy: 0.231138
Best accuracy epoch : 199
Best validation loss : 3.267654
Best validation epoch : 199
Training complete in 140m 10s
Best validation accuracy: 0.285030
Best accuracy epoch : 297
Best validation loss : 2.968024
Best validation epoch : 300
(15% accuracy at 105 epochs, 26% (225/836) after 300 epochs)
(15% accuracy at 150 epochs)
Training complete in 548m 1s
Best validation accuracy: 0.165269
Best accuracy epoch : 98 (actually 148)
Best validation loss : 3.677971
Best validation epoch : 100 (actually 150)
Test Loss: 3.738435
Test Accuracy: 15% (132/836)
You will now use transfer learning to create a CNN that can identify dog breed from images. Your CNN must attain at least 60% accuracy on the test set.
Use the code cell below to write three separate data loaders for the training, validation, and test datasets of dog images (located at dogImages/train, dogImages/valid, and dogImages/test, respectively).
If you like, you are welcome to use the same data loaders from the previous step, when you created a CNN from scratch.
## TODO: Specify data loaders
import os
from PIL import Image
from PIL import ImageFile
from torchvision import datasets, transforms
ImageFile.LOAD_TRUNCATED_IMAGES = True
# data_dir = work_dir + 'dog_images/'
# train_dir = data_dir + 'train/'
# valid_dir = data_dir + 'valid/'
# test_dir = data_dir + 'test/'
data_dir = orig_dir + 'dog_images/'
train_dir = data_dir + 'train/'
valid_dir = data_dir + 'valid/'
test_dir = data_dir + 'test/'
batch_size = 128
num_workers = 0 ## only 0 works in workspace...
image_size = 224
# Fall back to ImageNet statistics unless custom values were defined earlier
if 'img_means' in globals() and img_means is not None:
    imgmeans = img_means
else:
    imgmeans = [0.485, 0.456, 0.406]
if 'img_std' in globals() and img_std is not None:
    imgstd = img_std
else:
    imgstd = [0.229, 0.224, 0.225]
data_transforms = {
'train': transforms.Compose([transforms.RandomAffine(15, translate=(0.1, 0.1), scale=(1.0, 1.5),
shear=None, resample=Image.BILINEAR,
fillcolor=0),
transforms.Resize(image_size + (image_size//7),
interpolation=Image.BILINEAR),
transforms.CenterCrop(image_size),
transforms.RandomHorizontalFlip(),
transforms.ColorJitter(brightness=0.3, contrast=0.3,
saturation=0.2, hue=0.05),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
]),
'valid': transforms.Compose([transforms.Resize(image_size + (image_size//7)),
transforms.CenterCrop(image_size),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
]),
'test' : transforms.Compose([transforms.Resize(image_size + (image_size//7)),
transforms.CenterCrop(image_size),
transforms.ToTensor(),
transforms.Normalize(imgmeans, imgstd)
])
}
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
data_transforms[x])
for x in ['train', 'valid', 'test']}
class_names = image_datasets['train'].classes
dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'valid', 'test']}
loaders_transfer = {x: torch.utils.data.DataLoader(image_datasets[x],
                                                   batch_size=batch_size,
                                                   shuffle=(x != 'test'),
                                                   num_workers=num_workers)
                    for x in ['train', 'valid', 'test']}
Seems to only work with num_workers = 0
class ImageFolderWithPaths(datasets.ImageFolder):
    """Custom dataset that includes image file paths. Extends
    torchvision.datasets.ImageFolder.
    """
    # override the __getitem__ method; this is the method the dataloader calls
    def __getitem__(self, index):
        # this is what ImageFolder normally returns
        original_tuple = super(ImageFolderWithPaths, self).__getitem__(index)
        # the image file path
        path = self.imgs[index][0]
        # make a new tuple that includes the original and the path
        return original_tuple + (path,)
test_datasets = {x: ImageFolderWithPaths(os.path.join(data_dir, x),
data_transforms[x])
for x in ['valid', 'test']}
test_dataloaders = {x: torch.utils.data.DataLoader(test_datasets[x],
batch_size=batch_size,
shuffle=False)
for x in ['valid', 'test']}
del model_transfer
Use transfer learning to create a CNN to classify dog breed. Use the code cell below, and save your initialized model as the variable model_transfer.
import torchvision.models as models
import torch.nn as nn
## TODO: Specify model architecture
class AdaptiveConcatPool2d(nn.Module):
    def __init__(self, sz=None):
        super().__init__()
        sz = sz or (1, 1)
        self.ap = nn.AdaptiveAvgPool2d(sz)
        self.mp = nn.AdaptiveMaxPool2d(sz)

    def forward(self, x):
        return torch.cat([self.mp(x), self.ap(x)], 1)

class Flatten(nn.Module):
    def forward(self, x):
        return x.view(x.size(0), -1)
def get_resnet_model(modelname='resnet50', swap=False, sm=False):
    if modelname == 'resnet18':
        model = models.resnet18(pretrained=True)
    elif modelname == 'resnet34':
        model = models.resnet34(pretrained=True)
    elif modelname == 'resnet50':
        model = models.resnet50(pretrained=True)
    elif modelname == 'resnet101':
        model = models.resnet101(pretrained=True)
    elif modelname == 'resnet152':
        model = models.resnet152(pretrained=True)
    else:
        model = models.resnet50(pretrained=True)
    # freeze the pre-trained feature extractor
    for param in model.parameters():
        param.requires_grad = False
    clf_input_size = model.fc.in_features
    clf_output_size = 133  # len(class_names)
    nf = clf_input_size * 2  # doubled by AdaptiveConcatPool2d
    if swap:
        # swap the last 2 layers for new ones - adaptive concat pooling and
        # more linear layers with dropout
        Resnetlayers = []
        Resnetlayers.append(AdaptiveConcatPool2d())
        Resnetlayers.append(Flatten())
        Resnetlayers.append(nn.BatchNorm1d(nf, eps=1e-05, momentum=0.1, affine=True,
                                           track_running_stats=True))
        Resnetlayers.append(nn.Dropout(p=0.25))
        Resnetlayers.append(nn.Linear(in_features=nf, out_features=512, bias=True))
        Resnetlayers.append(nn.ReLU())
        Resnetlayers.append(nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True,
                                           track_running_stats=True))
        Resnetlayers.append(nn.Dropout(p=0.25))
        Resnetlayers.append(nn.Linear(in_features=512, out_features=clf_output_size,
                                      bias=True))
        if sm:
            Resnetlayers.append(nn.LogSoftmax(dim=1))
        Resnetlist = list(model.children())[:-2] + Resnetlayers
        model = torch.nn.Sequential(*Resnetlist)
    else:
        # standard: replace the fc layer to suit the number of classes
        model.fc = nn.Linear(clf_input_size, clf_output_size)
    return model
model_name = "resnet152"
model_transfer = get_resnet_model(model_name, swap=False, sm=False)
if use_cuda:
    model_transfer = model_transfer.cuda()
print(model_transfer)
Question 5: Outline the steps you took to get to your final CNN architecture and your reasoning at each step. Describe why you think the architecture is suitable for the current problem.
Answer:
The function get_resnet_model loads a pre-trained ResNet, freezes its parameters, and replaces the classifier; it can also swap in an architecture of adaptive concat pooling and 2 linear layers, the same as I used in the model designed from scratch. I tested this configuration and report on the results below.
Set use_weights to True to use the weights to compensate for class imbalance
from collections import defaultdict
class_counts = defaultdict(int)
for _, c in image_datasets["train"].imgs:
    class_counts[c] += 1
class_weights = [1 - (float(class_counts[class_id]) / len(image_datasets["train"].imgs))
                 for class_id in range(len(image_datasets["train"].classes))]
class_weights = torch.FloatTensor(class_weights)
class_weights = class_weights.to(device)  # .to() is not in-place, so reassign
use_weights = False
learning_rate = 0.05
Use the next code cell to specify a loss function and optimizer. Save the chosen loss function as criterion_transfer, and the optimizer as optimizer_transfer below.
import torch.optim as optim
if use_weights:
    criterion_transfer = nn.CrossEntropyLoss(weight=class_weights.to(device), reduction='sum')
else:
    criterion_transfer = nn.CrossEntropyLoss()
parameters = filter(lambda p: p.requires_grad, model_transfer.parameters())
optimizer_transfer = optim.Adam(parameters, lr=learning_rate)
from torch.optim import lr_scheduler
multistep = False
OnPlateau = False
# Decay LR by a factor of 0.1 every 14 epochs (9 for inception)
if OnPlateau:
    #exp_lr_scheduler = lr_scheduler.ReduceLROnPlateau(optimizer_transfer, mode='max',
    #                                                  factor=0.5, patience=5, min_lr=0.000001)
    exp_lr_scheduler = lr_scheduler.ReduceLROnPlateau(optimizer_transfer, mode='max',
                                                      factor=0.5, patience=5)
elif multistep:
    exp_lr_scheduler = lr_scheduler.MultiStepLR(optimizer_transfer, [7, 13, 23],
                                                gamma=0.05)  # gamma=0.1
else:
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_transfer, 10, gamma=0.1)
    #exp_lr_scheduler = lr_scheduler.StepLR(optimizer_transfer, 10, gamma=0.05)
    # exp_lr_scheduler = lr_scheduler.StepLR(optimizer_transfer, 5, gamma=0.1)
exp_lr_scheduler
with active_session():
logs, losses = find_lr(model_transfer, optimizer_transfer, criterion_transfer, loaders_transfer)
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(logs[10:-7],losses[10:-7])
train_losses, valid_losses, train_acc_history, val_acc_history, lr_hist = [], [], [], [], []
best_acc, best_val_epoch, best_acc_epoch, epoch_loss_min = 0.0, 0, 0, np.inf
def train_model(model, criterion, optimizer, dataloaders, scheduler, use_scheduler=False,
use_weights=False, num_epochs=50, first_epoch=0, ufilename=None,
is_inception=False):
global train_losses, valid_losses, val_acc_history, lr_hist
global best_acc, best_val_epoch, best_acc_epoch, epoch_loss_min
global model_name
since = time.time()
best_model_wts = copy.deepcopy(model.state_dict())
best_optim_wts = copy.deepcopy(optimizer.state_dict())
curr_lr = get_learning_rate(optimizer)
# if first_epoch is given, assume it's to restart training at that point using 1-based indexing,
# so decrement it by 1, then add the value to num_epochs
if first_epoch > 0:
first_epoch -= 1
num_epochs = first_epoch + num_epochs
for epoch in range(first_epoch, num_epochs):
print('Epoch {}/{}'.format(epoch+1, num_epochs))
print('-' * 11)
# Each epoch has a training and validation phase
for phase in ['train', 'valid']:
if phase == 'train':
if use_scheduler==True:
if isinstance(scheduler, torch.optim.lr_scheduler.ReduceLROnPlateau):
# using best accuracy - the max setting
scheduler.step(best_acc)
else:
scheduler.step()
model.train() # Set model to training mode
else:
model.eval() # Set model to evaluate mode
running_loss = 0.0
running_corrects = 0
# Iterate over data.
for inputs, labels in dataloaders[phase]:
inputs, labels = inputs.to(device), labels.to(device)
# zero the parameter gradients
optimizer.zero_grad()
# forward
# track history if only in train
with torch.set_grad_enabled(phase == 'train'):
# Get model outputs and calculate loss
# ref https://discuss.pytorch.org/t/how-to-optimize-inception-model-with-auxiliary-classifiers/7958
# Special case for inception because in training it has an auxiliary output.
# In train mode we calculate the loss by summing the final output and
# the auxiliary output
# but in testing we only consider the final output.
outputs = model(inputs)
if isinstance(outputs, tuple):
if epoch == first_epoch:
print("Outputs:", outputs)
if is_inception and phase == 'train':
loss1 = criterion(outputs[0], labels)
loss2 = criterion(outputs[1], labels)
loss = loss1 + 0.4*loss2
else:
loss = sum((criterion(o,labels) for o in outputs))
else:
loss = criterion(outputs, labels)
_, preds = torch.max(outputs, 1)
# backward + optimize only if in training phase
if phase == 'train':
loss.backward()
optimizer.step()
# statistics
running_loss += loss.item() * (1 if use_weights==True else inputs.size(0))
running_corrects += torch.sum(preds == labels.data)
epoch_loss = running_loss / dataset_sizes[phase]
epoch_acc = running_corrects.double() / dataset_sizes[phase]
if phase == 'train':
train_losses.append(epoch_loss)
curr_lr = get_learning_rate(optimizer)
lr_hist.append(curr_lr)
print("Learning rate: {}".format(curr_lr))
if phase == 'valid':
valid_losses.append(epoch_loss)
val_acc_history.append(epoch_acc.item())
print('{} Loss: {:.4f} Acc: {:.4f}'.format(
phase, epoch_loss, epoch_acc))
# deep copy the model
if phase == 'valid' and epoch_acc > best_acc:
filename = model_name + '_acc_'
if ufilename is not None:
filename = filename + ufilename + '_'
else:
if use_scheduler==True:
filename = filename + 'schd_'
if use_weights==True:
filename = filename + 'wgts_'
### filename = filename + str(epoch+1)
if filename[-1:] == "_":
filename = filename[:-1]
print(
'Accuracy has increased ({:.6f} --> {:.6f}). Saving model as {}...'.format(
best_acc, epoch_acc,
filename))
best_acc_epoch = epoch+1
best_acc = epoch_acc
best_model_wts = copy.deepcopy(model.state_dict())
best_optim_wts = copy.deepcopy(optimizer.state_dict())
torch.save(best_model_wts, filename + '.pt')
torch.save(best_optim_wts, filename + '_optimizer.pt')
if phase == 'valid' and epoch_loss <= epoch_loss_min:
filename = model_name + '_'
if ufilename is not None:
filename = filename + ufilename + '_'
else:
if use_scheduler==True:
filename = filename + 'schd_'
if use_weights==True:
filename = filename + 'wgts_'
### filename = filename + str(epoch+1)
if filename[-1:] == "_":
filename = filename[:-1]
print(
'Validation loss decreased ({:.6f} --> {:.6f}). Saving model as {}...'.format(
epoch_loss_min,
epoch_loss,
filename))
best_val_epoch = epoch+1
epoch_loss_min = epoch_loss
torch.save(model.state_dict(), filename + '.pt')
torch.save(optimizer.state_dict(), filename + '_optimizer.pt')
print()
time_elapsed = time.time() - since
print('Training complete in {:.0f}m {:.0f}s'.format(
time_elapsed // 60, time_elapsed % 60))
print('Best validation Accuracy: {:4f}'.format(best_acc))
print('Best accuracy epoch : {}'.format(best_acc_epoch))
print('Best validation Loss : {:4f}'.format(epoch_loss_min))
print('Best validation epoch : {}'.format(best_val_epoch))
# load best model weights
model.load_state_dict(best_model_wts)
optimizer.load_state_dict(best_optim_wts)
# Not automatically overwriting model_transfer state_dict as this may not be the best model...
# # Save to required model_scratch
# torch.save(model.state_dict(), 'model_transfer.pt')
# torch.save(optimizer.state_dict(), 'model_transfer_optimizer.pt')
return model
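The first_epoch handling at the top of train_model converts a 1-based restart epoch into the 0-based range the loop iterates over, so that exactly num_epochs more epochs run from the restart point. That arithmetic, isolated as a small self-contained sketch (the function name is mine, for illustration only):

```python
def epoch_range(first_epoch, num_epochs):
    """Mirror train_model's restart logic: a 1-based first_epoch shifts the
    loop start so that exactly num_epochs more epochs run from that point."""
    if first_epoch > 0:
        first_epoch -= 1
        num_epochs = first_epoch + num_epochs
    return list(range(first_epoch, num_epochs))

print(epoch_range(1, 10))   # a fresh run: epochs 0..9 (displayed as 1..10)
print(epoch_range(11, 10))  # a restart at epoch 11: epochs 10..19 (displayed as 11..20)
```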
Train and validate your model in the code cell below. Save the final model parameters at filepath 'model_transfer.pt'.
# experiment with a training loop inspired by the 1 cycle approach,
# starting with a low rate, up to very high, then back again...
step = 10
start_epoch = 1
end_epoch = start_epoch + step
start = 1e-05
middle = 1e-03
end = 1e-04
lr = start
start_epoch = 1
with active_session():
# train the model
while lr < middle:
print(lr)
parameters = filter(lambda p: p.requires_grad, model_transfer.parameters())
optimizer_transfer = optim.Adam(parameters, lr=lr)
end_epoch = start_epoch + step - 1
print(str(start_epoch), " --> ", str(end_epoch))
print()
model_transfer = train_model(model_transfer,
criterion_transfer,
optimizer_transfer,
loaders_transfer,
exp_lr_scheduler,
use_scheduler=False,
use_weights=use_weights,
num_epochs=step,
first_epoch=start_epoch,
ufilename="1cycle_1",
is_inception=(model_name=="inception"))
print()
print('=' * 20)
print()
start_epoch = end_epoch + 1
lr *= 10
lr = round(lr, 8)
# set step to 15 for the remaining up cycle
step = 15
# break here if wanting to split training before the down cycle
# (if so, the down cycle will need its own active_session wrapper)
# with active_session():
step = 15
while lr >= end:
print(lr)
parameters = filter(lambda p: p.requires_grad, model_transfer.parameters())
optimizer_transfer = optim.Adam(parameters, lr=lr)
end_epoch = start_epoch + step - 1
print(str(start_epoch), " --> ", str(end_epoch))
print()
model_transfer = train_model(model_transfer,
criterion_transfer,
optimizer_transfer,
loaders_transfer,
exp_lr_scheduler,
use_scheduler=False,
use_weights=use_weights,
num_epochs=step,
first_epoch=start_epoch,
ufilename="1cycle_1",
is_inception=(model_name=="inception"))
print()
print('=' * 20)
print()
start_epoch = end_epoch + 1
lr *= 1/10
lr = round(lr, 8)
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(train_losses[2:], label="Training loss")
plt.plot(valid_losses[2:], label="Validation loss")
plt.legend(frameon=False)
plt.show()
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(val_acc_history[2:], label="Validation accuracy")
plt.legend(frameon=False)
plt.show()
step = 15
start_epoch = 1
end_epoch = start_epoch + step
start = 0.00005
middle = 0.00005
end = 0.000005
lr = start
start_epoch = 1
with active_session():
while lr <= middle:
print(lr)
parameters = filter(lambda p: p.requires_grad, model_transfer.parameters())
optimizer_transfer = optim.Adam(parameters, lr=lr)
end_epoch = start_epoch + step - 1
print(str(start_epoch), " --> ", str(end_epoch))
print()
model_transfer = train_model(model_transfer,
criterion_transfer,
optimizer_transfer,
loaders_transfer,
exp_lr_scheduler,
use_scheduler=False,
use_weights=use_weights,
num_epochs=step,
first_epoch=start_epoch,
ufilename="1cycle_4",
is_inception=(model_name=="inception"))
print()
print('=' * 20)
print()
start_epoch = end_epoch + 1
lr *= 10
lr = round(lr, 8)
# with active_session():
# Step at 15 epochs per stage for the remainder of the training run
step = 15
while lr >= end:
print(lr)
parameters = filter(lambda p: p.requires_grad, model_transfer.parameters())
optimizer_transfer = optim.Adam(parameters, lr=lr)
end_epoch = start_epoch + step - 1
print(str(start_epoch), " --> ", str(end_epoch))
print()
model_transfer = train_model(model_transfer,
criterion_transfer,
optimizer_transfer,
loaders_transfer,
exp_lr_scheduler,
use_scheduler=False,
use_weights=use_weights,
num_epochs=step,
first_epoch=start_epoch,
ufilename="1cycle_4",
is_inception=(model_name=="inception"))
print()
print('=' * 20)
print()
start_epoch = end_epoch + 1
lr *= 1/10
lr = round(lr, 8)
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(train_losses, label="Training loss")
plt.plot(valid_losses, label="Validation loss")
plt.legend(frameon=False)
plt.show()
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(val_acc_history, label="Validation accuracy")
plt.legend(frameon=False)
plt.show()
with active_session():
# train the model
model_transfer = train_model(model_transfer,
criterion_transfer,
optimizer_transfer,
loaders_transfer,
exp_lr_scheduler,
use_scheduler=True,
use_weights=use_weights,
num_epochs=50,
first_epoch=1,
ufilename="std",
is_inception=(model_name=="inception"))
# load the model that got the best validation accuracy (uncomment the line below)
#model_transfer.load_state_dict(torch.load('model_transfer.pt'))
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(train_losses[2:], label="Training loss")
plt.plot(valid_losses[2:], label="Validation loss")
plt.legend(frameon=False)
plt.show()
plt.rcParams['figure.figsize'] = [9.5, 6]
plt.plot(val_acc_history[2:], label="Validation accuracy")
plt.legend(frameon=False)
plt.show()
model_transfer.load_state_dict(torch.load('model_transfer.pt'))
optimizer_transfer.load_state_dict(torch.load('model_transfer_optimizer.pt'))
checkpt = 'model_1cycle_1.pth'
torch.save({'model_statedict':model_transfer.state_dict(),
'optimizer_statedict':optimizer_transfer.state_dict(),
'best_acc_epoch' : 48,
'best_val_epoch' : 48,
'train_losses' : train_losses,
'valid_losses' : valid_losses,
'val_acc_history' : val_acc_history,
'train_acc_history' : train_acc_history},
checkpt)
checkpt = 'model_1cycle_8.pth'
torch.save({'model_statedict':model_transfer.state_dict(),
'optimizer_statedict':optimizer_transfer.state_dict(),
'best_acc_epoch' : 45,
'best_val_epoch' : 57,
'train_losses' : train_losses,
'valid_losses' : valid_losses,
'val_acc_history' : val_acc_history,
'train_acc_history' : train_acc_history},
checkpt)
checkpt = 'model_standard.pth'
torch.save({'model_statedict':model_transfer.state_dict(),
'optimizer_statedict':optimizer_transfer.state_dict(),
'best_acc_epoch' : 45,
'best_val_epoch' : 45,
'train_losses' : train_losses,
'valid_losses' : valid_losses,
'val_acc_history' : val_acc_history,
'train_acc_history' : train_acc_history},
checkpt)
checkpt = 'model_standard_mk2.pth'
torch.save({'model_statedict':model_transfer.state_dict(),
'optimizer_statedict':optimizer_transfer.state_dict(),
'best_acc_epoch' : 24,
'best_val_epoch' : 41,
'train_losses' : train_losses,
'valid_losses' : valid_losses,
'val_acc_history' : val_acc_history,
'train_acc_history' : train_acc_history},
checkpt)
These checkpoint dictionaries contain the various arrays created during training, so they can be used to recreate the charts.
state_dicts_name = 'model_1cycle_1.pth'
learning_rate=0.001
reloadModel(model_transfer, optimizer_transfer, state_dicts_name, learning_rate, isV1=True)
state_dicts_name = 'model_1cycle_8.pth'
learning_rate=0.005
reloadModel(model_transfer, optimizer_transfer, state_dicts_name, learning_rate, isV1=True)
state_dicts_name = 'model_standard_mk2.pth'
learning_rate=0.05
reloadModel(model_transfer, optimizer_transfer, state_dicts_name, learning_rate, isV1=False)
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)
The best-performing training cycle used the "1cycle" approach, with a very slow build-up from a learning rate of 1e-07 up to 1e-03, then back down to 1e-04.
This was compared with a number of other approaches, including an initial standard approach as a baseline.
With the "1cycle" approach, the first thing I tried was to go all the way from 1e-07 up to 1e-01 and back down to 1e-08. This showed there was no further useful activity above 1e-03, and nothing useful happening below 1e-04 on the way back down.
I also assessed cycles around a 5e-03 peak, as variations around this were more in line with the ideal learning rate.
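The up-then-down schedule used in those runs can be restated as a small helper. This is a sketch (the function name is mine): it mirrors the two while loops in the training cells above, including the round(lr, 8) step, yielding one learning rate per training stage.

```python
def lr_stages(start, middle, end):
    """Yield the per-stage learning rates of the up-then-down cycle:
    multiply by 10 until `middle` is reached, then divide by 10 until
    the rate drops below `end` (mirroring the training-cell loops)."""
    lr = start
    while lr < middle:          # up cycle
        yield lr
        lr = round(lr * 10, 8)
    while lr >= end:            # down cycle
        yield lr
        lr = round(lr / 10, 8)

# e.g. the cycle described above: 1e-07 up to 1e-03, back down to 1e-04
stages = list(lr_stages(1e-07, 1e-03, 1e-04))
print(stages)
```

Each yielded rate corresponds to one call to train_model with a fresh optimizer at that learning rate.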
Test Loss: 0.357339
Test Accuracy: 90% (753/836)
Test Loss: 0.370314
Test Accuracy: 89% (751/836)
Test Loss: 3.045314 - 1 cycle was 0.357339
Test Accuracy: 88% (740/836) - 1 cycle was 90% (753/836)
The "1 cycle" approach worked fairly well: it gained a 1.5% increase in accuracy and maintained a much closer relationship between training and validation losses, compared with a standard approach (and without truncating the accuracy calculation). There is more to investigate so that I can implement it more consistently and according to the intended design, but even this rough attempt at it was an eye-opener.
import torch
import torch.nn.functional as F
import torchvision.models as models
ImageNetDict = eval(open("imagenet1000_clsidx_to_labels.txt").read())
resnetImageNet = models.resnet152(pretrained=True)
resnetImageNet.class_to_name = ImageNetDict
resnetImageNet.eval();
def ImageNet_predict(img_path, topk=1):
'''
Use pre-trained resnet152 model to obtain index corresponding to
predicted ImageNet class for image at specified path
Args:
img_path: path to an image
topk: number of predictions to return (allow for top 5 for instance)
Returns:
Index(s) corresponding to resnet152 model's prediction
Probability of the prediction(s)
'''
img = img_process(img_path)
img_tensor = torch.from_numpy(img).type(torch.FloatTensor)
img_tensor.unsqueeze_(0)
resnetImageNet.cpu()
resnetImageNet.eval()
with torch.no_grad():
log_ps = F.softmax(resnetImageNet.forward(img_tensor), dim=1)
probs, classes = torch.topk(log_ps, k=topk)
probs = probs.view(topk).detach().numpy().tolist()
classes = classes.view(topk).detach().numpy().tolist()
classnames = [resnetImageNet.class_to_name[cls] for cls in classes]
return classes, classnames, probs
def ImageNet_dog_detector(img_path, inclPrediction=False):
prediction = ImageNet_predict(img_path)
dog_detected = (prediction[0][0] in range(151, 269))
if inclPrediction==True:
return (dog_detected, prediction)
else:
return dog_detected
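The detector above relies on the fact that the ImageNet dog-breed classes occupy indices 151 through 268 inclusive, which is why range(151, 269) is used. A minimal check of that boundary logic (indices only, no model involved; the helper name is mine):

```python
# ImageNet dog-breed classes are indices 151..268 inclusive,
# hence range(151, 269) in the detector above.
DOG_CLASS_RANGE = range(151, 269)

def is_dog_index(class_idx):
    """True when an ImageNet class index falls in the dog-breed block."""
    return class_idx in DOG_CLASS_RANGE

print(is_dog_index(151), is_dog_index(268), is_dog_index(269))  # True True False
```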
Write a function that takes an image path as input and returns the dog breed (Affenpinscher, Afghan hound, etc) that is predicted by your model.
### TODO: Write a function that takes a path to an image as input
### and returns the dog breed that is predicted by the model.
# list of class names by index, i.e. a name can be accessed like class_names[0]
class_names = [item[4:].replace("_", " ") for item in image_datasets['train'].classes]
class_examples = eval(open("dog_examples.txt").read())
model_transfer.class_to_name = dict(zip(range(133), class_names))
model_transfer.class_to_example = class_examples
def predict_breed_transfer(img_path, topk=1):
# load the image and return the predicted breed
img = img_process(img_path)
img_tensor = torch.from_numpy(img).type(torch.FloatTensor)
img_tensor.unsqueeze_(0)
model_transfer.cpu()
model_transfer.eval()
with torch.no_grad():
log_ps = F.softmax(model_transfer.forward(img_tensor), dim=1)
probs, classes = torch.topk(log_ps, k=topk)
probs = probs.view(topk).detach().numpy().tolist()
classes = classes.view(topk).detach().numpy().tolist()
classnames = [model_transfer.class_to_name[cls] for cls in classes]
classexamples = [model_transfer.class_to_example[cls] for cls in classes]
return classnames, classes, probs, classexamples
predict_breed_transfer('/data/dog_images/test/011.Australian_cattle_dog/Australian_cattle_dog_00728.jpg', 5)
Write an algorithm that accepts a file path to an image and first determines whether the image contains a human, dog, or neither. Then, if a dog is detected, return the predicted breed; if a human is detected, return the resembling dog breed; if neither is detected, provide output that indicates an error.
You are welcome to write your own functions for detecting humans and dogs in images, but feel free to use the face_detector and human_detector functions developed above. You are required to use your CNN from Step 4 to predict dog breed.
Some sample output for our algorithm is provided below, but feel free to design your own user experience!

### TODO: Write your algorithm.
### Feel free to use as many code cells as needed.
import ntpath
from PIL import Image, ImageFile, ImageFont, ImageDraw
def face_detector2(img_path):
img = cv2.imread(img_path)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray)
faces_found = len(faces) > 0
if faces_found:
for (x,y,w,h) in faces:
# add bounding box to color image
cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
return (faces_found, img)
def process_prediction(img_path, threshold=0.25):
## handle cases for a human face, dog, and neither
isdog = False
ishuman = False
isunknown = False
altClass = ""
isFromDogData = ('dog_images/' in img_path.lower()) or ('dog_images\\' in img_path.lower())
isFromHumanData = ('lfw/' in img_path.lower()) or ('lfw\\' in img_path.lower())
dog_breed_data = predict_breed_transfer(img_path)
imagenet_data = ImageNet_dog_detector(img_path, True)
altClass = imagenet_data[1][0]
altClassName = imagenet_data[1][1]
prediction = dog_breed_data[2][0]
# Signify a dog if the probability of a dog is above a threshold
isdog = (prediction > threshold)
# If the probability of a dog falls below a threshold and a standard
# imagenet classifier returns non-dog, make it unknown
if isdog == False:
if imagenet_data[0] == False:
isunknown = True
face_data = face_detector2(img_path)
if face_data[0] > 0 or altClass in [834]:
# could be a human
ishuman = True
img = face_data[1]
else:
img = cv2.imread(img_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# ground truth image... square crop it
# see http://corochann.com/basic-image-processing-tutorial-1220.html
ground_truth_img = cv2.imread(dog_breed_data[3][0])
ground_truth_img = cv2.cvtColor(ground_truth_img, cv2.COLOR_BGR2RGB)
height, width = ground_truth_img.shape[:2]
crop_length = min(height, width)
height_start = (height - crop_length) // 2
width_start = (width - crop_length) // 2
ground_truth_img = ground_truth_img[
height_start:height_start+crop_length,
width_start:width_start+crop_length,
:]
return dog_breed_data, imagenet_data, [isdog, ishuman, isunknown,
isFromDogData, isFromHumanData], img, ground_truth_img
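The ground-truth example image is square-cropped around its center before display. That slicing logic, isolated as a small helper (a sketch: the function name is mine, and a zero array stands in for a loaded image):

```python
import numpy as np

def center_square_crop(img):
    """Crop the largest centered square from an H x W x C image array,
    mirroring the slicing used on the ground-truth image above."""
    height, width = img.shape[:2]
    crop_length = min(height, width)
    height_start = (height - crop_length) // 2
    width_start = (width - crop_length) // 2
    return img[height_start:height_start + crop_length,
               width_start:width_start + crop_length, :]

img = np.zeros((300, 200, 3), dtype=np.uint8)  # stand-in for a loaded image
print(center_square_crop(img).shape)  # (200, 200, 3)
```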
def run_app(img_path, threshold=0.35):
img_name = ntpath.basename(ntpath.dirname(img_path)) + "/" + ntpath.basename(img_path)
subject_name = ntpath.basename(img_path).replace("_", " ")
color = "black"
image_data = process_prediction(img_path, threshold)
prediction = image_data[0][2]
prediction = prediction[0]
imagenetdog = image_data[1][0]
imagenetpred = image_data[1][1][0][0]
imagenetname = image_data[1][1][1][0]
# image_data[2] = [isdog, ishuman, isunknown, isFromDogData, isFromHumanData]
if image_data[2][0] == True and image_data[2][3] == True:
message1 = "You are a dog for sure!"
message2 = ":)"
elif image_data[2][0] == True and image_data[2][1] == True:
message1 = str(round(prediction * 100)) + "% like the guy on the right!"
message2 = "but, maybe you are people..."
elif image_data[2][0] == False and image_data[2][1] == True:
message1 = "Look like a human, yay!"
message2 = "the lucky dog looks like you!"
elif (prediction <= threshold) or (imagenetpred not in range(151, 269)):
message1 = "Something doggy about you..."
message2 = "not sure what kind of dog!"
if imagenetdog == False:
if imagenetpred in range(269,280): # wolves etc.
message2 = "maybe a cousin of sorts?"
elif imagenetpred in range(280,294): # cats etc.
message2 = "some cat disagrees :)"
elif imagenetpred in range(294,298): # bears etc.
message2 = "can you bear the thought?"
elif imagenetpred in range(365,385): # primates
message2 = "just monkeying around!"
elif imagenetpred >= 398: # man-made stuff...
message2 = "but that doesn't seem possible!"
else:
message2 = "but nobody is perfect!"
elif image_data[2][0] == True and prediction > threshold:
message1 = "Look like a dog for sure..."
message2 = "!!!"
elif image_data[2][0] == False:
message1 = "What are you?"
message2 = "??????"
elif image_data[2][2] == True:
message1 = "I know you're unknown..."
message2 = "How can that be?"
else:
message1 = "I know nothing!"
message2 = "really..."
_, axes = plt.subplots(figsize=(20,6), ncols=3)
for ii in range(3):
ax = axes[ii]
if ii == 0:
img = image_data[3]
title = subject_name
filename = img_name
elif ii == 1:
newimg = Image.new('RGB', (224, 224), (255, 255, 255))
fnt = ImageFont.truetype('images/calibri.ttf', 18)
d = ImageDraw.Draw(newimg)
d.text((4, 20), message1, font=fnt, fill=(0, 0, 0))
d.text((5, 50), message2, font=fnt, fill=(0, 0, 0))
newimg.save('images/textmsg.jpg')
img = cv2.imread('images/textmsg.jpg')
title = ""
filename = ""
else:
img = image_data[4]
title = image_data[0][0]
title = title[0]
filename = ""
ax.imshow(img)
ax.tick_params(axis='both', length=0)
ax.set_xticklabels('')
ax.set_yticklabels('')
ax.set_title(title, color=color)
ax.set_xlabel(filename)
if ii == 1:
ax.set_axis_off()
if os.path.isfile('images/textmsg.jpg'):
os.remove('images/textmsg.jpg')
run_app('/data/dog_images/test/011.Australian_cattle_dog/Australian_cattle_dog_00728.jpg')
In this section, you will take your new algorithm for a spin! What kind of dog does the algorithm think that you look like? If you have a dog, does it predict your dog's breed accurately? If you have a cat, does it mistakenly think that your cat is a dog?
Test your algorithm on at least six images on your computer. Feel free to use any images you like. Use at least two human images and two dog images.
Question 6: Is the output better than you expected :) ? Or worse :( ? Provide at least three possible points of improvement for your algorithm.
Answer:
The output is pretty much what I expected:
Points of improvement:
## TODO: Execute your algorithm from Step 6 on
## at least 6 images on your computer.
## Feel free to use as many code cells as needed.
## suggested code, below
for file in np.hstack((human_files[0], human_files[2],
'/data/lfw/Aaron_Guiel/Aaron_Guiel_0001.jpg',
dog_files[0], dog_files[80], dog_files[200])):
run_app(file)
for file in np.hstack(('images/Chris_HS_square.jpg',
'images/Curly-coated_retriever_03896.jpg',
'images/American_water_spaniel_00648.jpg',
'images/Brittany_02625.jpg',
'images/Welsh_springer_spaniel_08203.jpg',
'images/cat.69.jpg',
'images/cat.3.jpg',
'images/cat.10.jpg',
'images/black_bear.jpg',
'images/gorilla.jpg',
'images/chimp.jpg',
'images/that_monkey.jpg',
'images/monkeys.jpg',
'images/goldfish.jpg',
'images/wolf_1.jpg',
'images/wolf_2.jpg',
'images/Labrador_retriever_06455.jpg',
'images/Labrador_retriever_06449.jpg',
'images/Labrador_retriever_06455.jpg',
'images/Labrador_retriever_06457.jpg'
)):
run_app(file)
%%javascript
// Sourced from http://nbviewer.jupyter.org/gist/minrk/5d0946d39d511d9e0b5a
$("#renumber-button").parent().remove();
function renumber() {
// renumber cells in order
var i=1;
IPython.notebook.get_cells().map(function (cell) {
if (cell.cell_type == 'code') {
// set the input prompt
cell.set_input_prompt(i);
// set the output prompt (in two places)
cell.output_area.outputs.map(function (output) {
if (output.output_type == 'execute_result') {
output.execution_count = i;
cell.element.find(".output_prompt").text('Out[' + i + ']:');
}
});
i += 1;
}
});
}
IPython.toolbar.add_buttons_group([{
'label' : 'Renumber',
'icon' : 'fa-list-ol',
'callback': renumber,
'id' : 'renumber-button'
}]);
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg', img_sz=100),
title='Affenpinscher')
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg', img_sz=150),
title='Affenpinscher')
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg', img_sz=180),
title='Affenpinscher')
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg', img_sz=200),
title='Affenpinscher')
imshow(img_process(train_dir+'001.Affenpinscher/Affenpinscher_00001.jpg', img_sz=224),
title='Affenpinscher')
del model_scratch
import torch.nn as nn
import torch.nn.functional as F
# define the CNN architecture
class Net(nn.Module):
### TODO: choose an architecture, and complete the class
def __init__(self, num_classes=133):
super(Net, self).__init__()
## Define layers of a CNN
self.layer1 = nn.Sequential(
nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2, bias=False),
nn.ReLU(),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer2 = nn.Sequential(
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer3 = nn.Sequential(
nn.Conv2d(64, 96, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.classifier = nn.Sequential(
AdaptiveConcatPool2d(),
Flatten(),
nn.BatchNorm1d(96*2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=96*2, out_features=512, bias=True),
nn.ReLU(),
nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=512, out_features=num_classes, bias=True)
)
def forward(self, x):
## Define forward behavior
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.classifier(x)
return x
#-#-# You do NOT have to modify the code below this line. #-#-#
# instantiate the CNN
model_scratch = Net()
# move tensors to GPU if CUDA is available
if use_cuda:
model_scratch.cuda()
print(model_scratch)
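With 224x224 inputs, each of the three stride-2 max-pools halves the spatial size (224 to 112 to 56 to 28), and AdaptiveConcatPool2d concatenates an average-pooled and a max-pooled 96-channel map, which is why the classifier's first BatchNorm1d and Linear expect 96*2 = 192 features. The arithmetic as a quick sanity check:

```python
# Spatial size after three stride-2 max-pools on a 224x224 input.
size = 224
for _ in range(3):          # layer1, layer2, layer3 each end in MaxPool2d(2, 2)
    size //= 2
print(size)                 # 28

# AdaptiveConcatPool2d concatenates avg- and max-pooled maps over the
# final 96 channels, so the classifier sees 96 * 2 = 192 features.
clf_in_features = 96 * 2
print(clf_in_features)      # 192
```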
import torch.nn as nn
import torch.nn.functional as F
# define the CNN architecture
class Net(nn.Module):
### TODO: choose an architecture, and complete the class
def __init__(self, num_classes=133):
super(Net, self).__init__()
## Define layers of a CNN
self.layer1 = nn.Sequential(
nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2, bias=False),
nn.ReLU(),
nn.BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer2 = nn.Sequential(
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer3 = nn.Sequential(
nn.Conv2d(64, 96, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.classifier = nn.Sequential(
AdaptiveConcatPool2d(),
Flatten(),
nn.BatchNorm1d(96*2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=96*2, out_features=512, bias=True),
nn.ReLU(),
nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=512, out_features=num_classes, bias=True)
)
def forward(self, x):
## Define forward behavior
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.classifier(x)
return x
#-#-# You do NOT have to modify the code below this line. #-#-#
# instantiate the CNN
model_scratch = Net()
# move tensors to GPU if CUDA is available
if use_cuda:
model_scratch.cuda()
print(model_scratch)
Net(
(layer1): Sequential(
(0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
(1): ReLU()
(2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(layer2): Sequential(
(0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU()
(2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(layer3): Sequential(
(0): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): ReLU()
(2): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(classifier): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=(1, 1))
(mp): AdaptiveMaxPool2d(output_size=(1, 1))
)
(1): Flatten()
(2): BatchNorm1d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25)
(4): Linear(in_features=192, out_features=512, bias=True)
(5): ReLU()
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.25)
(8): Linear(in_features=512, out_features=133, bias=True)
)
)
import torch.nn as nn
import torch.nn.functional as F
# define the CNN architecture
class Net(nn.Module):
### TODO: choose an architecture, and complete the class
def __init__(self, num_classes=133):
super(Net, self).__init__()
## Define layers of a CNN
self.layer1 = nn.Sequential(
nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=2, bias=False),
nn.ReLU(),
nn.Conv2d(32, 32, kernel_size=5, stride=1, padding=2, bias=False),
nn.ReLU(),
nn.BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer2 = nn.Sequential(
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.layer3 = nn.Sequential(
nn.Conv2d(64, 96, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.Conv2d(96, 96, kernel_size=3, stride=1, padding=1, bias=False),
nn.ReLU(),
nn.BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True),
nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
)
self.classifier = nn.Sequential(
AdaptiveConcatPool2d(),
Flatten(),
nn.BatchNorm1d(96*2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=96*2, out_features=512, bias=True),
nn.ReLU(),
nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
nn.Dropout(p=0.25),
nn.Linear(in_features=512, out_features=num_classes, bias=True)
)
def forward(self, x):
## Define forward behavior
x = self.layer1(x)
x = self.layer2(x)
x = self.layer3(x)
x = self.classifier(x)
return x
#-#-# You do NOT have to modify the code below this line. #-#-#
# instantiate the CNN
model_scratch = Net()
# move tensors to GPU if CUDA is available
if use_cuda:
model_scratch.cuda()
print(model_scratch)
Net(
  (layer1): Sequential(
    (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
    (1): ReLU()
    (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2), bias=False)
    (3): ReLU()
    (4): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): ReLU()
    (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (3): ReLU()
    (4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer3): Sequential(
    (0): Conv2d(64, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): ReLU()
    (2): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (3): ReLU()
    (4): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=(1, 1))
      (mp): AdaptiveMaxPool2d(output_size=(1, 1))
    )
    (1): Flatten()
    (2): BatchNorm1d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25)
    (4): Linear(in_features=192, out_features=512, bias=True)
    (5): ReLU()
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.25)
    (8): Linear(in_features=512, out_features=133, bias=True)
  )
)
import torch.nn as nn
import torch.nn.functional as F

# define the CNN architecture
class Net(nn.Module):
    ### TODO: choose an architecture, and complete the class
    def __init__(self, num_classes=133):
        super(Net, self).__init__()
        ## Define layers of a CNN
        self.layer1 = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(64, 96, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(96, 128, kernel_size=3, stride=1, padding=1, bias=False),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2, padding=0)
        )
        self.classifier = nn.Sequential(
            AdaptiveConcatPool2d(),
            Flatten(),
            nn.BatchNorm1d(128*2, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
            nn.Dropout(p=0.25),
            nn.Linear(in_features=128*2, out_features=512, bias=True),
            nn.ReLU(),
            nn.BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
            nn.Dropout(p=0.25),
            nn.Linear(in_features=512, out_features=num_classes, bias=True)
        )

    def forward(self, x):
        ## Define forward behavior
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.classifier(x)
        return x

#-#-# You do NOT have to modify the code below this line. #-#-#

# instantiate the CNN
model_scratch = Net()

# move tensors to GPU if CUDA is available
if use_cuda:
    model_scratch.cuda()

print(model_scratch)
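The training log below interleaves two kinds of checkpoint messages: one model is saved whenever validation accuracy improves (`model_scratch_acc`) and another whenever validation loss improves (`model_scratch_val`). A hedged sketch of that bookkeeping, with the actual `torch.save` calls replaced by returned messages (the `update_checkpoints` function is illustrative, not the notebook's exact train loop):

```python
def update_checkpoints(valid_loss, valid_acc, best_loss, best_acc):
    """Track the best validation loss and accuracy seen so far and
    report which checkpoints a train loop would save this epoch."""
    messages = []
    if valid_acc > best_acc:
        messages.append(
            'Accuracy has increased ({:.6f} --> {:.6f}) '
            'Saving model as model_scratch_acc...'.format(best_acc, valid_acc))
        best_acc = valid_acc
    if valid_loss < best_loss:
        messages.append(
            'Validation loss decreased ({:.6f} --> {:.6f}) '
            'Saving model as model_scratch_val...'.format(best_loss, valid_loss))
        best_loss = valid_loss
    return best_loss, best_acc, messages

# First epoch: both bests start at their worst possible values,
# so both checkpoints fire.
best_loss, best_acc, msgs = update_checkpoints(4.919203, 0.013174,
                                               float('inf'), 0.0)
for msg in msgs:
    print(msg)
```

Saving on two separate criteria hedges against the fact that the lowest-loss epoch and the highest-accuracy epoch are often not the same, as the log below shows.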
Epoch: 1 Training Loss: 4.961590 Validation Loss: 4.919203
Epoch: 1 Training Accuracy: 0.007037 Validation Accuracy: 0.013174
Accuracy has increased (0.000000 --> 0.013174) Saving model as model_scratch_acc...
Validation loss decreased (inf --> 4.910386) Saving model as model_scratch_val...
Epoch: 2 Training Loss: 4.916895 Validation Loss: 4.887988
Epoch: 2 Training Accuracy: 0.010930 Validation Accuracy: 0.015569
Accuracy has increased (0.013174 --> 0.015569) Saving model as model_scratch_acc...
Validation loss decreased (4.910386 --> 4.883724) Saving model as model_scratch_val...
Epoch: 3 Training Loss: 4.899519 Validation Loss: 4.852184
Epoch: 3 Training Accuracy: 0.010630 Validation Accuracy: 0.016766
Accuracy has increased (0.015569 --> 0.016766) Saving model as model_scratch_acc...
Validation loss decreased (4.883724 --> 4.853319) Saving model as model_scratch_val...
Epoch: 4 Training Loss: 4.865728 Validation Loss: 4.833315
Epoch: 4 Training Accuracy: 0.015871 Validation Accuracy: 0.015569
Validation loss decreased (4.853319 --> 4.833284) Saving model as model_scratch_val...
Epoch: 5 Training Loss: 4.833236 Validation Loss: 4.808116
Epoch: 5 Training Accuracy: 0.017068 Validation Accuracy: 0.026347
Accuracy has increased (0.016766 --> 0.026347) Saving model as model_scratch_acc...
Validation loss decreased (4.833284 --> 4.811524) Saving model as model_scratch_val...
Epoch: 6 Training Loss: 4.810703 Validation Loss: 4.801436
Epoch: 6 Training Accuracy: 0.017817 Validation Accuracy: 0.021557
Validation loss decreased (4.811524 --> 4.792679) Saving model as model_scratch_val...
Epoch: 7 Training Loss: 4.787861 Validation Loss: 4.792127
Epoch: 7 Training Accuracy: 0.022758 Validation Accuracy: 0.022754
Validation loss decreased (4.792679 --> 4.779539) Saving model as model_scratch_val...
Epoch: 8 Training Loss: 4.767632 Validation Loss: 4.769209
Epoch: 8 Training Accuracy: 0.023507 Validation Accuracy: 0.026347
Validation loss decreased (4.779539 --> 4.762892) Saving model as model_scratch_val...
Epoch: 9 Training Loss: 4.738388 Validation Loss: 4.745911
Epoch: 9 Training Accuracy: 0.029046 Validation Accuracy: 0.031138
Accuracy has increased (0.026347 --> 0.031138) Saving model as model_scratch_acc...
Validation loss decreased (4.762892 --> 4.749212) Saving model as model_scratch_val...
Epoch: 10 Training Loss: 4.730064 Validation Loss: 4.723873
Epoch: 10 Training Accuracy: 0.027549 Validation Accuracy: 0.026347
Validation loss decreased (4.749212 --> 4.734878) Saving model as model_scratch_val...
Epoch: 11 Training Loss: 4.707745 Validation Loss: 4.721910
Epoch: 11 Training Accuracy: 0.028897 Validation Accuracy: 0.025150
Validation loss decreased (4.734878 --> 4.717407) Saving model as model_scratch_val...
Epoch: 12 Training Loss: 4.694037 Validation Loss: 4.694993
Epoch: 12 Training Accuracy: 0.033837 Validation Accuracy: 0.028743
Validation loss decreased (4.717407 --> 4.701353) Saving model as model_scratch_val...
Epoch: 13 Training Loss: 4.669737 Validation Loss: 4.695582
Epoch: 13 Training Accuracy: 0.034287 Validation Accuracy: 0.025150
Validation loss decreased (4.701353 --> 4.694014) Saving model as model_scratch_val...
Epoch: 14 Training Loss: 4.661658 Validation Loss: 4.669526
Epoch: 14 Training Accuracy: 0.033987 Validation Accuracy: 0.034731
Accuracy has increased (0.031138 --> 0.034731) Saving model as model_scratch_acc...
Validation loss decreased (4.694014 --> 4.678348) Saving model as model_scratch_val...
Epoch: 15 Training Loss: 4.642054 Validation Loss: 4.651555
Epoch: 15 Training Accuracy: 0.036832 Validation Accuracy: 0.034731
Validation loss decreased (4.678348 --> 4.658789) Saving model as model_scratch_val...
Epoch: 16 Training Loss: 4.623108 Validation Loss: 4.661414
Epoch: 16 Training Accuracy: 0.038928 Validation Accuracy: 0.032335
Validation loss decreased (4.658789 --> 4.649315) Saving model as model_scratch_val...
Epoch: 17 Training Loss: 4.616818 Validation Loss: 4.629588
Epoch: 17 Training Accuracy: 0.039377 Validation Accuracy: 0.034731
Validation loss decreased (4.649315 --> 4.634639) Saving model as model_scratch_val...
Epoch: 18 Training Loss: 4.593741 Validation Loss: 4.654449
Epoch: 18 Training Accuracy: 0.041773 Validation Accuracy: 0.037126
Accuracy has increased (0.034731 --> 0.037126) Saving model as model_scratch_acc...
Validation loss decreased (4.634639 --> 4.623248) Saving model as model_scratch_val...
Epoch: 19 Training Loss: 4.585801 Validation Loss: 4.602505
Epoch: 19 Training Accuracy: 0.042671 Validation Accuracy: 0.041916
Accuracy has increased (0.037126 --> 0.041916) Saving model as model_scratch_acc...
Validation loss decreased (4.623248 --> 4.607734) Saving model as model_scratch_val...
Epoch: 20 Training Loss: 4.566946 Validation Loss: 4.613661
Epoch: 20 Training Accuracy: 0.046714 Validation Accuracy: 0.039521
Validation loss decreased (4.607734 --> 4.599737) Saving model as model_scratch_val...
Epoch: 21 Training Loss: 4.550723 Validation Loss: 4.567756
Epoch: 21 Training Accuracy: 0.045366 Validation Accuracy: 0.038323
Validation loss decreased (4.599737 --> 4.578774) Saving model as model_scratch_val...
Epoch: 22 Training Loss: 4.538908 Validation Loss: 4.575675
Epoch: 22 Training Accuracy: 0.044767 Validation Accuracy: 0.044311
Accuracy has increased (0.041916 --> 0.044311) Saving model as model_scratch_acc...
Validation loss decreased (4.578774 --> 4.571023) Saving model as model_scratch_val...
Epoch: 23 Training Loss: 4.525758 Validation Loss: 4.562220
Epoch: 23 Training Accuracy: 0.049858 Validation Accuracy: 0.046707
Accuracy has increased (0.044311 --> 0.046707) Saving model as model_scratch_acc...
Validation loss decreased (4.571023 --> 4.561492) Saving model as model_scratch_val...
Epoch: 24 Training Loss: 4.517275 Validation Loss: 4.533606
Epoch: 24 Training Accuracy: 0.049558 Validation Accuracy: 0.051497
Accuracy has increased (0.046707 --> 0.051497) Saving model as model_scratch_acc...
Validation loss decreased (4.561492 --> 4.544248) Saving model as model_scratch_val...
Epoch: 25 Training Loss: 4.502127 Validation Loss: 4.542381
Epoch: 25 Training Accuracy: 0.049708 Validation Accuracy: 0.052695
Accuracy has increased (0.051497 --> 0.052695) Saving model as model_scratch_acc...
Validation loss decreased (4.544248 --> 4.538356) Saving model as model_scratch_val...
Epoch: 26 Training Loss: 4.494338 Validation Loss: 4.532583
Epoch: 26 Training Accuracy: 0.050007 Validation Accuracy: 0.058683
Accuracy has increased (0.052695 --> 0.058683) Saving model as model_scratch_acc...
Validation loss decreased (4.538356 --> 4.520129) Saving model as model_scratch_val...
Epoch: 27 Training Loss: 4.477561 Validation Loss: 4.495865
Epoch: 27 Training Accuracy: 0.047911 Validation Accuracy: 0.049102
Validation loss decreased (4.520129 --> 4.508329) Saving model as model_scratch_val...
Epoch: 28 Training Loss: 4.458783 Validation Loss: 4.506287
Epoch: 28 Training Accuracy: 0.055996 Validation Accuracy: 0.056287
Validation loss decreased (4.508329 --> 4.500564) Saving model as model_scratch_val...
Epoch: 29 Training Loss: 4.460913 Validation Loss: 4.486317
Epoch: 29 Training Accuracy: 0.057643 Validation Accuracy: 0.057485
Validation loss decreased (4.500564 --> 4.485874) Saving model as model_scratch_val...
Epoch: 30 Training Loss: 4.441758 Validation Loss: 4.479993
Epoch: 30 Training Accuracy: 0.057344 Validation Accuracy: 0.064671
Accuracy has increased (0.058683 --> 0.064671) Saving model as model_scratch_acc...
Validation loss decreased (4.485874 --> 4.467017) Saving model as model_scratch_val...
Epoch: 31 Training Loss: 4.433699 Validation Loss: 4.470017
Epoch: 31 Training Accuracy: 0.057044 Validation Accuracy: 0.062275
Validation loss decreased (4.467017 --> 4.455606) Saving model as model_scratch_val...
Epoch: 32 Training Loss: 4.414536 Validation Loss: 4.455982
Epoch: 32 Training Accuracy: 0.059590 Validation Accuracy: 0.064671
Validation loss decreased (4.455606 --> 4.438676) Saving model as model_scratch_val...
Epoch: 33 Training Loss: 4.412766 Validation Loss: 4.416891
Epoch: 33 Training Accuracy: 0.059889 Validation Accuracy: 0.065868
Accuracy has increased (0.064671 --> 0.065868) Saving model as model_scratch_acc...
Validation loss decreased (4.438676 --> 4.425948) Saving model as model_scratch_val...
Epoch: 34 Training Loss: 4.394729 Validation Loss: 4.416844
Epoch: 34 Training Accuracy: 0.062135 Validation Accuracy: 0.063473
Validation loss decreased (4.425948 --> 4.418158) Saving model as model_scratch_val...
Epoch: 35 Training Loss: 4.383926 Validation Loss: 4.420107
Epoch: 35 Training Accuracy: 0.060937 Validation Accuracy: 0.069461
Accuracy has increased (0.065868 --> 0.069461) Saving model as model_scratch_acc...
Validation loss decreased (4.418158 --> 4.415691) Saving model as model_scratch_val...
Epoch: 36 Training Loss: 4.381893 Validation Loss: 4.400412
Epoch: 36 Training Accuracy: 0.061536 Validation Accuracy: 0.067066
Validation loss decreased (4.415691 --> 4.401057) Saving model as model_scratch_val...
Epoch: 37 Training Loss: 4.362713 Validation Loss: 4.398118
Epoch: 37 Training Accuracy: 0.064531 Validation Accuracy: 0.065868
Validation loss decreased (4.401057 --> 4.387594) Saving model as model_scratch_val...
Epoch: 38 Training Loss: 4.353604 Validation Loss: 4.380332
Epoch: 38 Training Accuracy: 0.066776 Validation Accuracy: 0.070659
Accuracy has increased (0.069461 --> 0.070659) Saving model as model_scratch_acc...
Validation loss decreased (4.387594 --> 4.375505) Saving model as model_scratch_val...
Epoch: 39 Training Loss: 4.341194 Validation Loss: 4.360950
Epoch: 39 Training Accuracy: 0.064531 Validation Accuracy: 0.067066
Validation loss decreased (4.375505 --> 4.374117) Saving model as model_scratch_val...
Epoch: 40 Training Loss: 4.320673 Validation Loss: 4.373131
Epoch: 40 Training Accuracy: 0.066627 Validation Accuracy: 0.065868
Validation loss decreased (4.374117 --> 4.350564) Saving model as model_scratch_val...
Epoch: 41 Training Loss: 4.320134 Validation Loss: 4.347889
Epoch: 41 Training Accuracy: 0.072765 Validation Accuracy: 0.071856
Accuracy has increased (0.070659 --> 0.071856) Saving model as model_scratch_acc...
Epoch: 42 Training Loss: 4.312190 Validation Loss: 4.330526
Epoch: 42 Training Accuracy: 0.070370 Validation Accuracy: 0.075449
Accuracy has increased (0.071856 --> 0.075449) Saving model as model_scratch_acc...
Validation loss decreased (4.350564 --> 4.342403) Saving model as model_scratch_val...
Epoch: 43 Training Loss: 4.290071 Validation Loss: 4.336576
Epoch: 43 Training Accuracy: 0.077706 Validation Accuracy: 0.070659
Validation loss decreased (4.342403 --> 4.333960) Saving model as model_scratch_val...
Epoch: 44 Training Loss: 4.282736 Validation Loss: 4.314463
Epoch: 44 Training Accuracy: 0.071568 Validation Accuracy: 0.079042
Accuracy has increased (0.075449 --> 0.079042) Saving model as model_scratch_acc...
Validation loss decreased (4.333960 --> 4.317463) Saving model as model_scratch_val...
Epoch: 45 Training Loss: 4.269857 Validation Loss: 4.308572
Epoch: 45 Training Accuracy: 0.072915 Validation Accuracy: 0.081437
Accuracy has increased (0.079042 --> 0.081437) Saving model as model_scratch_acc...
Validation loss decreased (4.317463 --> 4.300952) Saving model as model_scratch_val...
Epoch: 46 Training Loss: 4.261287 Validation Loss: 4.287661
Epoch: 46 Training Accuracy: 0.078305 Validation Accuracy: 0.076647
Validation loss decreased (4.300952 --> 4.288138) Saving model as model_scratch_val...
Epoch: 47 Training Loss: 4.241984 Validation Loss: 4.285376
Epoch: 47 Training Accuracy: 0.082647 Validation Accuracy: 0.083832
Accuracy has increased (0.081437 --> 0.083832) Saving model as model_scratch_acc...
Validation loss decreased (4.288138 --> 4.283875) Saving model as model_scratch_val...
Epoch: 48 Training Loss: 4.248154 Validation Loss: 4.265169
Epoch: 48 Training Accuracy: 0.076808 Validation Accuracy: 0.089820
Accuracy has increased (0.083832 --> 0.089820) Saving model as model_scratch_acc...
Validation loss decreased (4.283875 --> 4.275203) Saving model as model_scratch_val...
Epoch: 49 Training Loss: 4.221133 Validation Loss: 4.234239
Epoch: 49 Training Accuracy: 0.089235 Validation Accuracy: 0.086228
Validation loss decreased (4.275203 --> 4.253997) Saving model as model_scratch_val...
Epoch: 50 Training Loss: 4.227645 Validation Loss: 4.225454
Epoch: 50 Training Accuracy: 0.080252 Validation Accuracy: 0.079042
Validation loss decreased (4.253997 --> 4.245283) Saving model as model_scratch_val...
Epoch: 51 Training Loss: 4.210372 Validation Loss: 4.242854
Epoch: 51 Training Accuracy: 0.082048 Validation Accuracy: 0.087425
Validation loss decreased (4.245283 --> 4.233133) Saving model as model_scratch_val...
Epoch: 52 Training Loss: 4.196611 Validation Loss: 4.255041
Epoch: 52 Training Accuracy: 0.083845 Validation Accuracy: 0.079042
Validation loss decreased (4.233133 --> 4.226763) Saving model as model_scratch_val...
Epoch: 53 Training Loss: 4.188484 Validation Loss: 4.217831
Epoch: 53 Training Accuracy: 0.085043 Validation Accuracy: 0.091018
Accuracy has increased (0.089820 --> 0.091018) Saving model as model_scratch_acc...
Validation loss decreased (4.226763 --> 4.216272) Saving model as model_scratch_val...
Epoch: 54 Training Loss: 4.182420 Validation Loss: 4.210487
Epoch: 54 Training Accuracy: 0.085642 Validation Accuracy: 0.089820
Validation loss decreased (4.216272 --> 4.208432) Saving model as model_scratch_val...
Epoch: 55 Training Loss: 4.164401 Validation Loss: 4.203396
Epoch: 55 Training Accuracy: 0.089534 Validation Accuracy: 0.086228
Validation loss decreased (4.208432 --> 4.195600) Saving model as model_scratch_val...
Epoch: 56 Training Loss: 4.148393 Validation Loss: 4.191838
Epoch: 56 Training Accuracy: 0.094176 Validation Accuracy: 0.085030
Validation loss decreased (4.195600 --> 4.191843) Saving model as model_scratch_val...
Epoch: 57 Training Loss: 4.146093 Validation Loss: 4.199538
Epoch: 57 Training Accuracy: 0.092978 Validation Accuracy: 0.088623
Validation loss decreased (4.191843 --> 4.176208) Saving model as model_scratch_val...
Epoch: 58 Training Loss: 4.135556 Validation Loss: 4.178194
Epoch: 58 Training Accuracy: 0.090133 Validation Accuracy: 0.088623
Validation loss decreased (4.176208 --> 4.158777) Saving model as model_scratch_val...
Epoch: 59 Training Loss: 4.125721 Validation Loss: 4.156998
Epoch: 59 Training Accuracy: 0.089235 Validation Accuracy: 0.088623
Validation loss decreased (4.158777 --> 4.156878) Saving model as model_scratch_val...
Epoch: 60 Training Loss: 4.127804 Validation Loss: 4.157080
Epoch: 60 Training Accuracy: 0.094475 Validation Accuracy: 0.089820
Validation loss decreased (4.156878 --> 4.145257) Saving model as model_scratch_val...
Epoch: 61 Training Loss: 4.107591 Validation Loss: 4.136007
Epoch: 61 Training Accuracy: 0.104207 Validation Accuracy: 0.087425
Validation loss decreased (4.145257 --> 4.131725) Saving model as model_scratch_val...
Epoch: 62 Training Loss: 4.087030 Validation Loss: 4.135820
Epoch: 62 Training Accuracy: 0.098368 Validation Accuracy: 0.094611
Accuracy has increased (0.091018 --> 0.094611) Saving model as model_scratch_acc...
Validation loss decreased (4.131725 --> 4.124839) Saving model as model_scratch_val...
Epoch: 63 Training Loss: 4.083349 Validation Loss: 4.103133
Epoch: 63 Training Accuracy: 0.097619 Validation Accuracy: 0.100599
Accuracy has increased (0.094611 --> 0.100599) Saving model as model_scratch_acc...
Validation loss decreased (4.124839 --> 4.118589) Saving model as model_scratch_val...
Epoch: 64 Training Loss: 4.082498 Validation Loss: 4.093785
Epoch: 64 Training Accuracy: 0.098967 Validation Accuracy: 0.088623
Validation loss decreased (4.118589 --> 4.107300) Saving model as model_scratch_val...
Epoch: 65 Training Loss: 4.063479 Validation Loss: 4.089746
Epoch: 65 Training Accuracy: 0.105255 Validation Accuracy: 0.097006
Validation loss decreased (4.107300 --> 4.095206) Saving model as model_scratch_val...
Epoch: 66 Training Loss: 4.047646 Validation Loss: 4.098458
Epoch: 66 Training Accuracy: 0.107651 Validation Accuracy: 0.097006
Validation loss decreased (4.095206 --> 4.089656) Saving model as model_scratch_val...
Epoch: 67 Training Loss: 4.052424 Validation Loss: 4.086577
Epoch: 67 Training Accuracy: 0.104357 Validation Accuracy: 0.095808
Validation loss decreased (4.089656 --> 4.069397) Saving model as model_scratch_val...
Epoch: 68 Training Loss: 4.044847 Validation Loss: 4.064115
Epoch: 68 Training Accuracy: 0.108399 Validation Accuracy: 0.098204
Validation loss decreased (4.069397 --> 4.067373) Saving model as model_scratch_val...
Epoch: 69 Training Loss: 4.023386 Validation Loss: 4.050792
Epoch: 69 Training Accuracy: 0.104656 Validation Accuracy: 0.101796
Accuracy has increased (0.100599 --> 0.101796) Saving model as model_scratch_acc...
Validation loss decreased (4.067373 --> 4.047521) Saving model as model_scratch_val...
Epoch: 70 Training Loss: 4.025335 Validation Loss: 4.032914
Epoch: 70 Training Accuracy: 0.105854 Validation Accuracy: 0.105389
Accuracy has increased (0.101796 --> 0.105389) Saving model as model_scratch_acc...
Validation loss decreased (4.047521 --> 4.037204) Saving model as model_scratch_val...
Epoch: 71 Training Loss: 4.004679 Validation Loss: 4.050064
Epoch: 71 Training Accuracy: 0.107501 Validation Accuracy: 0.097006
Validation loss decreased (4.037204 --> 4.035535) Saving model as model_scratch_val...
Epoch: 72 Training Loss: 4.009485 Validation Loss: 4.044646
Epoch: 72 Training Accuracy: 0.110645 Validation Accuracy: 0.108982
Accuracy has increased (0.105389 --> 0.108982) Saving model as model_scratch_acc...
Validation loss decreased (4.035535 --> 4.018070) Saving model as model_scratch_val...
Epoch: 73 Training Loss: 3.990971 Validation Loss: 4.020820
Epoch: 73 Training Accuracy: 0.113789 Validation Accuracy: 0.107784
Epoch: 74 Training Loss: 3.977431 Validation Loss: 4.019989
Epoch: 74 Training Accuracy: 0.111394 Validation Accuracy: 0.101796
Validation loss decreased (4.018070 --> 4.007282) Saving model as model_scratch_val...
Epoch: 75 Training Loss: 3.983584 Validation Loss: 3.976029
Epoch: 75 Training Accuracy: 0.112292 Validation Accuracy: 0.113772
Accuracy has increased (0.108982 --> 0.113772) Saving model as model_scratch_acc...
Validation loss decreased (4.007282 --> 3.984974) Saving model as model_scratch_val...
Epoch: 76 Training Loss: 3.971886 Validation Loss: 3.977511
Epoch: 76 Training Accuracy: 0.112442 Validation Accuracy: 0.108982
Epoch: 77 Training Loss: 3.952858 Validation Loss: 3.961319
Epoch: 77 Training Accuracy: 0.111394 Validation Accuracy: 0.107784
Validation loss decreased (3.984974 --> 3.974182) Saving model as model_scratch_val...
Epoch: 78 Training Loss: 3.959173 Validation Loss: 3.952832
Epoch: 78 Training Accuracy: 0.111993 Validation Accuracy: 0.116168
Accuracy has increased (0.113772 --> 0.116168) Saving model as model_scratch_acc...
Validation loss decreased (3.974182 --> 3.962397) Saving model as model_scratch_val...
Epoch: 79 Training Loss: 3.945993 Validation Loss: 3.956717
Epoch: 79 Training Accuracy: 0.121575 Validation Accuracy: 0.108982
Validation loss decreased (3.962397 --> 3.951301) Saving model as model_scratch_val...
Epoch: 80 Training Loss: 3.927968 Validation Loss: 3.978616
Epoch: 80 Training Accuracy: 0.122773 Validation Accuracy: 0.122156
Accuracy has increased (0.116168 --> 0.122156) Saving model as model_scratch_acc...
Epoch: 81 Training Loss: 3.924944 Validation Loss: 3.926396
Epoch: 81 Training Accuracy: 0.129810 Validation Accuracy: 0.118563
Validation loss decreased (3.951301 --> 3.937121) Saving model as model_scratch_val...
Epoch: 82 Training Loss: 3.919422 Validation Loss: 3.944021
Epoch: 82 Training Accuracy: 0.120677 Validation Accuracy: 0.113772
Validation loss decreased (3.937121 --> 3.931134) Saving model as model_scratch_val...
Epoch: 83 Training Loss: 3.927398 Validation Loss: 3.924654
Epoch: 83 Training Accuracy: 0.124719 Validation Accuracy: 0.123353
Accuracy has increased (0.122156 --> 0.123353) Saving model as model_scratch_acc...
Validation loss decreased (3.931134 --> 3.923072) Saving model as model_scratch_val...
Epoch: 84 Training Loss: 3.903913 Validation Loss: 3.922621
Epoch: 84 Training Accuracy: 0.125168 Validation Accuracy: 0.114970
Validation loss decreased (3.923072 --> 3.915552) Saving model as model_scratch_val...
Epoch: 85 Training Loss: 3.896400 Validation Loss: 3.904023
Epoch: 85 Training Accuracy: 0.126666 Validation Accuracy: 0.122156
Validation loss decreased (3.915552 --> 3.902635) Saving model as model_scratch_val...
Epoch: 86 Training Loss: 3.883279 Validation Loss: 3.913476
Epoch: 86 Training Accuracy: 0.126965 Validation Accuracy: 0.120958
Validation loss decreased (3.902635 --> 3.895025) Saving model as model_scratch_val...
Epoch: 87 Training Loss: 3.867261 Validation Loss: 3.907522
Epoch: 87 Training Accuracy: 0.125468 Validation Accuracy: 0.116168
Validation loss decreased (3.895025 --> 3.878712) Saving model as model_scratch_val...
Epoch: 88 Training Loss: 3.873288 Validation Loss: 3.871173
Epoch: 88 Training Accuracy: 0.130858 Validation Accuracy: 0.126946
Accuracy has increased (0.123353 --> 0.126946) Saving model as model_scratch_acc...
Epoch: 89 Training Loss: 3.864966 Validation Loss: 3.852593
Epoch: 89 Training Accuracy: 0.128762 Validation Accuracy: 0.134132
Accuracy has increased (0.126946 --> 0.134132) Saving model as model_scratch_acc...
Validation loss decreased (3.878712 --> 3.874895) Saving model as model_scratch_val...
Epoch: 90 Training Loss: 3.848657 Validation Loss: 3.883047
Epoch: 90 Training Accuracy: 0.134302 Validation Accuracy: 0.124551
Validation loss decreased (3.874895 --> 3.865834) Saving model as model_scratch_val...
Epoch: 91 Training Loss: 3.840996 Validation Loss: 3.861967
Epoch: 91 Training Accuracy: 0.135350 Validation Accuracy: 0.131737
Validation loss decreased (3.865834 --> 3.850665) Saving model as model_scratch_val...
Epoch: 92 Training Loss: 3.845536 Validation Loss: 3.840424
Epoch: 92 Training Accuracy: 0.134002 Validation Accuracy: 0.132934
Epoch: 93 Training Loss: 3.827234 Validation Loss: 3.818939
Epoch: 93 Training Accuracy: 0.138194 Validation Accuracy: 0.138922
Accuracy has increased (0.134132 --> 0.138922) Saving model as model_scratch_acc...
Validation loss decreased (3.850665 --> 3.833186) Saving model as model_scratch_val...
Epoch: 94 Training Loss: 3.834230 Validation Loss: 3.832601
Epoch: 94 Training Accuracy: 0.132655 Validation Accuracy: 0.136527
Epoch: 95 Training Loss: 3.806633 Validation Loss: 3.848658
Epoch: 95 Training Accuracy: 0.135649 Validation Accuracy: 0.129341
Validation loss decreased (3.833186 --> 3.820892) Saving model as model_scratch_val...
Epoch: 96 Training Loss: 3.800740 Validation Loss: 3.773103
Epoch: 96 Training Accuracy: 0.136547 Validation Accuracy: 0.134132
Validation loss decreased (3.820892 --> 3.804624) Saving model as model_scratch_val...
Epoch: 97 Training Loss: 3.797831 Validation Loss: 3.802944
Epoch: 97 Training Accuracy: 0.141788 Validation Accuracy: 0.136527
Epoch: 98 Training Loss: 3.793869 Validation Loss: 3.797077
Epoch: 98 Training Accuracy: 0.140740 Validation Accuracy: 0.141317
Accuracy has increased (0.138922 --> 0.141317) Saving model as model_scratch_acc...
Validation loss decreased (3.804624 --> 3.796730) Saving model as model_scratch_val...
Epoch: 99 Training Loss: 3.780378 Validation Loss: 3.818412
Epoch: 99 Training Accuracy: 0.141788 Validation Accuracy: 0.134132
Validation loss decreased (3.796730 --> 3.785231) Saving model as model_scratch_val...
Epoch: 100 Training Loss: 3.774595 Validation Loss: 3.801023
Epoch: 100 Training Accuracy: 0.139392 Validation Accuracy: 0.142515
Accuracy has increased (0.141317 --> 0.142515) Saving model as model_scratch_acc...
Epoch: 101 Training Loss: 3.778807 Validation Loss: 3.769093
Epoch: 101 Training Accuracy: 0.145381 Validation Accuracy: 0.132934
Validation loss decreased (3.785231 --> 3.772684) Saving model as model_scratch_val...
Epoch: 102 Training Loss: 3.763545 Validation Loss: 3.789487
Epoch: 102 Training Accuracy: 0.147777 Validation Accuracy: 0.141317
Validation loss decreased (3.772684 --> 3.766131) Saving model as model_scratch_val...
Epoch: 103 Training Loss: 3.769912 Validation Loss: 3.727435
Epoch: 103 Training Accuracy: 0.141039 Validation Accuracy: 0.152096
Accuracy has increased (0.142515 --> 0.152096) Saving model as model_scratch_acc...
Validation loss decreased (3.766131 --> 3.749369) Saving model as model_scratch_val...
Epoch: 104 Training Loss: 3.750626 Validation Loss: 3.754451
Epoch: 104 Training Accuracy: 0.157958 Validation Accuracy: 0.142515
Epoch: 105 Training Loss: 3.740579 Validation Loss: 3.747797
Epoch: 105 Training Accuracy: 0.144183 Validation Accuracy: 0.153293
Accuracy has increased (0.152096 --> 0.153293) Saving model as model_scratch_acc...
Validation loss decreased (3.749369 --> 3.741578) Saving model as model_scratch_val...
Epoch: 106 Training Loss: 3.732043 Validation Loss: 3.750707
Epoch: 106 Training Accuracy: 0.152717 Validation Accuracy: 0.146108
Validation loss decreased (3.741578 --> 3.734856) Saving model as model_scratch_val...
Epoch: 107 Training Loss: 3.727972 Validation Loss: 3.734976
Epoch: 107 Training Accuracy: 0.154963 Validation Accuracy: 0.152096
Epoch: 108 Training Loss: 3.732185 Validation Loss: 3.720115
Epoch: 108 Training Accuracy: 0.149124 Validation Accuracy: 0.154491
Accuracy has increased (0.153293 --> 0.154491) Saving model as model_scratch_acc...
Validation loss decreased (3.734856 --> 3.718266) Saving model as model_scratch_val...
Epoch: 109 Training Loss: 3.714134 Validation Loss: 3.745629
Epoch: 109 Training Accuracy: 0.155862 Validation Accuracy: 0.148503
Epoch: 110 Training Loss: 3.708722 Validation Loss: 3.724134
Epoch: 110 Training Accuracy: 0.154664 Validation Accuracy: 0.155689
Accuracy has increased (0.154491 --> 0.155689) Saving model as model_scratch_acc...
Validation loss decreased (3.718266 --> 3.712750) Saving model as model_scratch_val...
Epoch: 111 Training Loss: 3.701528 Validation Loss: 3.718828
Epoch: 111 Training Accuracy: 0.159156 Validation Accuracy: 0.158084
Accuracy has increased (0.155689 --> 0.158084) Saving model as model_scratch_acc...
Validation loss decreased (3.712750 --> 3.705527) Saving model as model_scratch_val...
Epoch: 112 Training Loss: 3.703289 Validation Loss: 3.709388
Epoch: 112 Training Accuracy: 0.148226 Validation Accuracy: 0.165269
Accuracy has increased (0.158084 --> 0.165269) Saving model as model_scratch_acc...
Validation loss decreased (3.705527 --> 3.702551) Saving model as model_scratch_val...
Epoch: 113 Training Loss: 3.685634 Validation Loss: 3.714809
Epoch: 113 Training Accuracy: 0.162000 Validation Accuracy: 0.159281
Validation loss decreased (3.702551 --> 3.693098) Saving model as model_scratch_val...
Epoch: 114 Training Loss: 3.687310 Validation Loss: 3.674237
Epoch: 114 Training Accuracy: 0.156311 Validation Accuracy: 0.152096
Validation loss decreased (3.693098 --> 3.679544) Saving model as model_scratch_val...
Epoch: 115 Training Loss: 3.679395 Validation Loss: 3.636306
Epoch: 115 Training Accuracy: 0.155263 Validation Accuracy: 0.161677
Epoch: 116 Training Loss: 3.673845 Validation Loss: 3.667615
Epoch: 116 Training Accuracy: 0.159904 Validation Accuracy: 0.164072
Validation loss decreased (3.679544 --> 3.677391) Saving model as model_scratch_val...
Epoch: 117 Training Loss: 3.660675 Validation Loss: 3.651525
Epoch: 117 Training Accuracy: 0.160353 Validation Accuracy: 0.170060
Accuracy has increased (0.165269 --> 0.170060) Saving model as model_scratch_acc...
Validation loss decreased (3.677391 --> 3.656473) Saving model as model_scratch_val...
Epoch: 118 Training Loss: 3.653746 Validation Loss: 3.664187
Epoch: 118 Training Accuracy: 0.162899 Validation Accuracy: 0.160479
Epoch: 119 Training Loss: 3.645944 Validation Loss: 3.597809
Epoch: 119 Training Accuracy: 0.173080 Validation Accuracy: 0.168862
Validation loss decreased (3.656473 --> 3.649102) Saving model as model_scratch_val...
Epoch: 120 Training Loss: 3.659712 Validation Loss: 3.636005
Epoch: 120 Training Accuracy: 0.160952 Validation Accuracy: 0.167665
Validation loss decreased (3.649102 --> 3.630691) Saving model as model_scratch_val...
Epoch: 121 Training Loss: 3.644701 Validation Loss: 3.623431
Epoch: 121 Training Accuracy: 0.162599 Validation Accuracy: 0.170060
Validation loss decreased (3.630691 --> 3.630185) Saving model as model_scratch_val...
Epoch: 122 Training Loss: 3.645003 Validation Loss: 3.612422
Epoch: 122 Training Accuracy: 0.167989 Validation Accuracy: 0.167665
Validation loss decreased (3.630185 --> 3.627559) Saving model as model_scratch_val...
Epoch: 123 Training Loss: 3.636429 Validation Loss: 3.609053
Epoch: 123 Training Accuracy: 0.161551 Validation Accuracy: 0.170060
Validation loss decreased (3.627559 --> 3.612759) Saving model as model_scratch_val...
Epoch: 124 Training Loss: 3.614089 Validation Loss: 3.653380
Epoch: 124 Training Accuracy: 0.164396 Validation Accuracy: 0.168862
Epoch: 125 Training Loss: 3.625476 Validation Loss: 3.650490
Epoch: 125 Training Accuracy: 0.163647 Validation Accuracy: 0.180838
Accuracy has increased (0.170060 --> 0.180838) Saving model as model_scratch_acc...
Epoch: 126 Training Loss: 3.604398 Validation Loss: 3.594820
Epoch: 126 Training Accuracy: 0.171283 Validation Accuracy: 0.176048
Validation loss decreased (3.612759 --> 3.612396) Saving model as model_scratch_val...
Epoch: 127 Training Loss: 3.595028 Validation Loss: 3.577291
Epoch: 127 Training Accuracy: 0.175775 Validation Accuracy: 0.179641
Validation loss decreased (3.612396 --> 3.579763) Saving model as model_scratch_val...
Epoch: 128 Training Loss: 3.599304 Validation Loss: 3.595695
Epoch: 128 Training Accuracy: 0.172481 Validation Accuracy: 0.173653
Epoch: 129 Training Loss: 3.591354 Validation Loss: 3.586159
Epoch: 129 Training Accuracy: 0.168738 Validation Accuracy: 0.180838
Epoch: 130 Training Loss: 3.577348 Validation Loss: 3.616155
Epoch: 130 Training Accuracy: 0.176673 Validation Accuracy: 0.183234
Accuracy has increased (0.180838 --> 0.183234) Saving model as model_scratch_acc...
Epoch: 131 Training Loss: 3.567074 Validation Loss: 3.531819
Epoch: 131 Training Accuracy: 0.179967 Validation Accuracy: 0.177246
Validation loss decreased (3.579763 --> 3.568812) Saving model as model_scratch_val...
Epoch: 132 Training Loss: 3.564107 Validation Loss: 3.577753
Epoch: 132 Training Accuracy: 0.175625 Validation Accuracy: 0.183234
Validation loss decreased (3.568812 --> 3.566189) Saving model as model_scratch_val...
Epoch: 133 Training Loss: 3.550288 Validation Loss: 3.554419
Epoch: 133 Training Accuracy: 0.177122 Validation Accuracy: 0.188024
Accuracy has increased (0.183234 --> 0.188024) Saving model as model_scratch_acc...
Validation loss decreased (3.566189 --> 3.564850) Saving model as model_scratch_val...
Epoch: 134 Training Loss: 3.567092 Validation Loss: 3.600030
Epoch: 134 Training Accuracy: 0.176673 Validation Accuracy: 0.174850
Validation loss decreased (3.564850 --> 3.555997) Saving model as model_scratch_val...
Epoch: 135 Training Loss: 3.552739 Validation Loss: 3.551371
Epoch: 135 Training Accuracy: 0.175775 Validation Accuracy: 0.182036
Epoch: 136 Training Loss: 3.557916 Validation Loss: 3.565541
Epoch: 136 Training Accuracy: 0.175326 Validation Accuracy: 0.190419
Accuracy has increased (0.188024 --> 0.190419) Saving model as model_scratch_acc...
Validation loss decreased (3.555997 --> 3.549866) Saving model as model_scratch_val...
Epoch: 137 Training Loss: 3.545806 Validation Loss: 3.531090
Epoch: 137 Training Accuracy: 0.173379 Validation Accuracy: 0.179641
Validation loss decreased (3.549866 --> 3.531282) Saving model as model_scratch_val...
Epoch: 138 Training Loss: 3.527649 Validation Loss: 3.551901
Epoch: 138 Training Accuracy: 0.183860 Validation Accuracy: 0.194012
Accuracy has increased (0.190419 --> 0.194012) Saving model as model_scratch_acc...
Epoch: 139 Training Loss: 3.518505 Validation Loss: 3.516140
Epoch: 139 Training Accuracy: 0.187154 Validation Accuracy: 0.183234
Validation loss decreased (3.531282 --> 3.528402) Saving model as model_scratch_val...
Epoch: 140 Training Loss: 3.535452 Validation Loss: 3.518444
Epoch: 140 Training Accuracy: 0.183411 Validation Accuracy: 0.184431
Validation loss decreased (3.528402 --> 3.522587) Saving model as model_scratch_val...
Epoch: 141 Training Loss: 3.517627 Validation Loss: 3.484757
Epoch: 141 Training Accuracy: 0.182063 Validation Accuracy: 0.191617
Validation loss decreased (3.522587 --> 3.508568) Saving model as model_scratch_val...
Epoch: 142 Training Loss: 3.512470 Validation Loss: 3.510522
Epoch: 142 Training Accuracy: 0.184010 Validation Accuracy: 0.190419
Validation loss decreased (3.508568 --> 3.502295) Saving model as model_scratch_val...
Epoch: 143 Training Loss: 3.510679 Validation Loss: 3.526705
Epoch: 143 Training Accuracy: 0.187902 Validation Accuracy: 0.191617
Epoch: 144 Training Loss: 3.510440 Validation Loss: 3.487937
Epoch: 144 Training Accuracy: 0.186255 Validation Accuracy: 0.196407
Accuracy has increased (0.194012 --> 0.196407) Saving model as model_scratch_acc...
Validation loss decreased (3.502295 --> 3.498636) Saving model as model_scratch_val...
Epoch: 145 Training Loss: 3.496385 Validation Loss: 3.496437
Epoch: 145 Training Accuracy: 0.187154 Validation Accuracy: 0.198802
Accuracy has increased (0.196407 --> 0.198802) Saving model as model_scratch_acc...
Validation loss decreased (3.498636 --> 3.493196) Saving model as model_scratch_val...
Epoch: 146 Training Loss: 3.501367 Validation Loss: 3.448648
Epoch: 146 Training Accuracy: 0.188651 Validation Accuracy: 0.198802
Validation loss decreased (3.493196 --> 3.488006) Saving model as model_scratch_val...
Epoch: 147 Training Loss: 3.495651 Validation Loss: 3.468408
Epoch: 147 Training Accuracy: 0.188501 Validation Accuracy: 0.207186
Accuracy has increased (0.198802 --> 0.207186) Saving model as model_scratch_acc...
Validation loss decreased (3.488006 --> 3.476880) Saving model as model_scratch_val...
Epoch: 148 Training Loss: 3.479345 Validation Loss: 3.488336
Epoch: 148 Training Accuracy: 0.187154 Validation Accuracy: 0.200000
Validation loss decreased (3.476880 --> 3.473121) Saving model as model_scratch_val...
Epoch: 149 Training Loss: 3.471785 Validation Loss: 3.481541
Epoch: 149 Training Accuracy: 0.192843 Validation Accuracy: 0.196407
Epoch: 150 Training Loss: 3.468506 Validation Loss: 3.451855
Epoch: 150 Training Accuracy: 0.185507 Validation Accuracy: 0.197605
Validation loss decreased (3.473121 --> 3.469247) Saving model as model_scratch_val...
Epoch: 151 Training Loss: 3.477448 Validation Loss: 3.450499
Epoch: 151 Training Accuracy: 0.186405 Validation Accuracy: 0.213174
Accuracy has increased (0.207186 --> 0.213174) Saving model as model_scratch_acc...
Validation loss decreased (3.469247 --> 3.460272) Saving model as model_scratch_val...
Epoch: 152 Training Loss: 3.460494 Validation Loss: 3.480053
Epoch: 152 Training Accuracy: 0.186405 Validation Accuracy: 0.202395
Validation loss decreased (3.460272 --> 3.457204) Saving model as model_scratch_val...
Epoch: 153 Training Loss: 3.460124 Validation Loss: 3.442125
Epoch: 153 Training Accuracy: 0.188651 Validation Accuracy: 0.197605
Validation loss decreased (3.457204 --> 3.441875) Saving model as model_scratch_val...
Epoch: 154 Training Loss: 3.462604 Validation Loss: 3.445044
Epoch: 154 Training Accuracy: 0.188950 Validation Accuracy: 0.201198
Epoch: 155 Training Loss: 3.455742 Validation Loss: 3.438510
Epoch: 155 Training Accuracy: 0.196437 Validation Accuracy: 0.192814
Validation loss decreased (3.441875 --> 3.429765) Saving model as model_scratch_val...
Epoch: 156 Training Loss: 3.444303 Validation Loss: 3.428447
Epoch: 156 Training Accuracy: 0.190448 Validation Accuracy: 0.207186
Validation loss decreased (3.429765 --> 3.426168) Saving model as model_scratch_val...
Epoch: 157 Training Loss: 3.439654 Validation Loss: 3.417716
Epoch: 157 Training Accuracy: 0.192095 Validation Accuracy: 0.197605
Epoch: 158 Training Loss: 3.418483 Validation Loss: 3.466569
Epoch: 158 Training Accuracy: 0.204372 Validation Accuracy: 0.203593
Validation loss decreased (3.426168 --> 3.424753) Saving model as model_scratch_val...
Epoch: 159 Training Loss: 3.414056 Validation Loss: 3.466969
Epoch: 159 Training Accuracy: 0.197934 Validation Accuracy: 0.200000
Validation loss decreased (3.424753 --> 3.414974) Saving model as model_scratch_val...
Epoch: 160 Training Loss: 3.422180 Validation Loss: 3.366662
Epoch: 160 Training Accuracy: 0.192244 Validation Accuracy: 0.205988
Validation loss decreased (3.414974 --> 3.403102) Saving model as model_scratch_val...
Epoch: 161 Training Loss: 3.398428 Validation Loss: 3.381595
Epoch: 161 Training Accuracy: 0.203773 Validation Accuracy: 0.202395
Validation loss decreased (3.403102 --> 3.398518) Saving model as model_scratch_val...
Epoch: 162 Training Loss: 3.405592 Validation Loss: 3.443935
Epoch: 162 Training Accuracy: 0.205121 Validation Accuracy: 0.202395
Epoch: 163 Training Loss: 3.405982 Validation Loss: 3.396415
Epoch: 163 Training Accuracy: 0.202276 Validation Accuracy: 0.210778
Epoch: 164 Training Loss: 3.391732 Validation Loss: 3.456382
Epoch: 164 Training Accuracy: 0.207067 Validation Accuracy: 0.211976
Validation loss decreased (3.398518 --> 3.387149) Saving model as model_scratch_val...
Epoch: 165 Training Loss: 3.400285 Validation Loss: 3.414183
Epoch: 165 Training Accuracy: 0.198084 Validation Accuracy: 0.211976
Validation loss decreased (3.387149 --> 3.384973) Saving model as model_scratch_val...
Epoch: 166 Training Loss: 3.388603 Validation Loss: 3.394268
Epoch: 166 Training Accuracy: 0.204971 Validation Accuracy: 0.204790
Validation loss decreased (3.384973 --> 3.382701) Saving model as model_scratch_val...
Epoch: 167 Training Loss: 3.382042 Validation Loss: 3.410890
Epoch: 167 Training Accuracy: 0.209163 Validation Accuracy: 0.205988
Validation loss decreased (3.382701 --> 3.380821) Saving model as model_scratch_val...
Epoch: 168 Training Loss: 3.367097 Validation Loss: 3.364518
Epoch: 168 Training Accuracy: 0.209612 Validation Accuracy: 0.219162
Accuracy has increased (0.213174 --> 0.219162) Saving model as model_scratch_acc...
Validation loss decreased (3.380821 --> 3.376894) Saving model as model_scratch_val...
Epoch: 169 Training Loss: 3.374738 Validation Loss: 3.404622
Epoch: 169 Training Accuracy: 0.197035 Validation Accuracy: 0.211976
Validation loss decreased (3.376894 --> 3.366131) Saving model as model_scratch_val...
Epoch: 170 Training Loss: 3.372807 Validation Loss: 3.369359
Epoch: 170 Training Accuracy: 0.206468 Validation Accuracy: 0.202395
Validation loss decreased (3.366131 --> 3.364498) Saving model as model_scratch_val...
Epoch: 171 Training Loss: 3.346749 Validation Loss: 3.344267
Epoch: 171 Training Accuracy: 0.212607 Validation Accuracy: 0.208383
Validation loss decreased (3.364498 --> 3.353850) Saving model as model_scratch_val...
Epoch: 172 Training Loss: 3.344807 Validation Loss: 3.406757
Epoch: 172 Training Accuracy: 0.211708 Validation Accuracy: 0.207186
Epoch: 173 Training Loss: 3.351054 Validation Loss: 3.333519
Epoch: 173 Training Accuracy: 0.215002 Validation Accuracy: 0.207186
Validation loss decreased (3.353850 --> 3.344182) Saving model as model_scratch_val...
Epoch: 174 Training Loss: 3.331668 Validation Loss: 3.358231
Epoch: 174 Training Accuracy: 0.210960 Validation Accuracy: 0.208383
Epoch: 175 Training Loss: 3.331441 Validation Loss: 3.319122
Epoch: 175 Training Accuracy: 0.210061 Validation Accuracy: 0.208383
Validation loss decreased (3.344182 --> 3.337987) Saving model as model_scratch_val...
Epoch: 176 Training Loss: 3.333740 Validation Loss: 3.330969
Epoch: 176 Training Accuracy: 0.219045 Validation Accuracy: 0.213174
Validation loss decreased (3.337987 --> 3.333484) Saving model as model_scratch_val...
Epoch: 177 Training Loss: 3.337514 Validation Loss: 3.299802
Epoch: 177 Training Accuracy: 0.209912 Validation Accuracy: 0.214371
Validation loss decreased (3.333484 --> 3.325973) Saving model as model_scratch_val...
Epoch: 178 Training Loss: 3.342118 Validation Loss: 3.341868
Epoch: 178 Training Accuracy: 0.208714 Validation Accuracy: 0.205988
Validation loss decreased (3.325973 --> 3.323620) Saving model as model_scratch_val...
Epoch: 179 Training Loss: 3.335632 Validation Loss: 3.342526
Epoch: 179 Training Accuracy: 0.211708 Validation Accuracy: 0.204790
Validation loss decreased (3.323620 --> 3.319835) Saving model as model_scratch_val...
Epoch: 180 Training Loss: 3.311182 Validation Loss: 3.325745
Epoch: 180 Training Accuracy: 0.212906 Validation Accuracy: 0.211976
Validation loss decreased (3.319835 --> 3.314347) Saving model as model_scratch_val...
Epoch: 181 Training Loss: 3.324093 Validation Loss: 3.277863
Epoch: 181 Training Accuracy: 0.218296 Validation Accuracy: 0.226347
Accuracy has increased (0.219162 --> 0.226347) Saving model as model_scratch_acc...
Validation loss decreased (3.314347 --> 3.301194) Saving model as model_scratch_val...
Epoch: 182 Training Loss: 3.312094 Validation Loss: 3.303442
Epoch: 182 Training Accuracy: 0.214553 Validation Accuracy: 0.208383
Epoch: 183 Training Loss: 3.303096 Validation Loss: 3.300206
Epoch: 183 Training Accuracy: 0.217248 Validation Accuracy: 0.219162
Epoch: 184 Training Loss: 3.291498 Validation Loss: 3.298159
Epoch: 184 Training Accuracy: 0.221141 Validation Accuracy: 0.203593
Validation loss decreased (3.301194 --> 3.288874) Saving model as model_scratch_val...
Epoch: 185 Training Loss: 3.296758 Validation Loss: 3.273988
Epoch: 185 Training Accuracy: 0.220243 Validation Accuracy: 0.211976
Validation loss decreased (3.288874 --> 3.283160) Saving model as model_scratch_val...
Epoch: 186 Training Loss: 3.294346 Validation Loss: 3.313086
Epoch: 186 Training Accuracy: 0.217847 Validation Accuracy: 0.216766
Epoch: 187 Training Loss: 3.282319 Validation Loss: 3.282534
Epoch: 187 Training Accuracy: 0.224585 Validation Accuracy: 0.210778
Epoch: 188 Training Loss: 3.273715 Validation Loss: 3.307482
Epoch: 188 Training Accuracy: 0.228926 Validation Accuracy: 0.210778
Validation loss decreased (3.283160 --> 3.278415) Saving model as model_scratch_val...
Epoch: 189 Training Loss: 3.283456 Validation Loss: 3.252257
Epoch: 189 Training Accuracy: 0.228926 Validation Accuracy: 0.209581
Validation loss decreased (3.278415 --> 3.275857) Saving model as model_scratch_val...
Epoch: 190 Training Loss: 3.281767 Validation Loss: 3.248670
Epoch: 190 Training Accuracy: 0.222189 Validation Accuracy: 0.229940
Accuracy has increased (0.226347 --> 0.229940) Saving model as model_scratch_acc...
Validation loss decreased (3.275857 --> 3.272370) Saving model as model_scratch_val...
Epoch: 191 Training Loss: 3.269533 Validation Loss: 3.261844
Epoch: 191 Training Accuracy: 0.229675 Validation Accuracy: 0.210778
Epoch: 192 Training Loss: 3.235482 Validation Loss: 3.244784
Epoch: 192 Training Accuracy: 0.231622 Validation Accuracy: 0.202395
Epoch: 193 Training Loss: 3.255410 Validation Loss: 3.213551
Epoch: 193 Training Accuracy: 0.229825 Validation Accuracy: 0.217964
Validation loss decreased (3.272370 --> 3.252377) Saving model as model_scratch_val...
Epoch: 194 Training Loss: 3.253254 Validation Loss: 3.207265
Epoch: 194 Training Accuracy: 0.227878 Validation Accuracy: 0.221557
Validation loss decreased (3.252377 --> 3.249375) Saving model as model_scratch_val...
Epoch: 195 Training Loss: 3.230494 Validation Loss: 3.222841
Epoch: 195 Training Accuracy: 0.232670 Validation Accuracy: 0.217964
Epoch: 196 Training Loss: 3.228616 Validation Loss: 3.230145
Epoch: 196 Training Accuracy: 0.229376 Validation Accuracy: 0.219162
Validation loss decreased (3.249375 --> 3.245753) Saving model as model_scratch_val...
Epoch: 197 Training Loss: 3.230228 Validation Loss: 3.237736
Epoch: 197 Training Accuracy: 0.230424 Validation Accuracy: 0.222754
Validation loss decreased (3.245753 --> 3.240042) Saving model as model_scratch_val...
Epoch: 198 Training Loss: 3.241547 Validation Loss: 3.224800
Epoch: 198 Training Accuracy: 0.223536 Validation Accuracy: 0.211976
Validation loss decreased (3.240042 --> 3.239954) Saving model as model_scratch_val...
Epoch: 199 Training Loss: 3.227425 Validation Loss: 3.210567
Epoch: 199 Training Accuracy: 0.235365 Validation Accuracy: 0.221557
Validation loss decreased (3.239954 --> 3.224857) Saving model as model_scratch_val...
Epoch: 200 Training Loss: 3.241978 Validation Loss: 3.189444
Epoch: 200 Training Accuracy: 0.223087 Validation Accuracy: 0.220359
Epoch: 201 Training Loss: 3.227196 Validation Loss: 3.229254
Epoch: 201 Training Accuracy: 0.232071 Validation Accuracy: 0.226347
Validation loss decreased (3.224857 --> 3.215815) Saving model as model_scratch_val...
Epoch: 202 Training Loss: 3.213295 Validation Loss: 3.201240
Epoch: 202 Training Accuracy: 0.233268 Validation Accuracy: 0.214371
Epoch: 203 Training Loss: 3.209830 Validation Loss: 3.228813
Epoch: 203 Training Accuracy: 0.228627 Validation Accuracy: 0.214371
Epoch: 204 Training Loss: 3.205185 Validation Loss: 3.192276
Epoch: 204 Training Accuracy: 0.233568 Validation Accuracy: 0.211976
Epoch: 205 Training Loss: 3.197230 Validation Loss: 3.204510
Epoch: 205 Training Accuracy: 0.231322 Validation Accuracy: 0.216766
Validation loss decreased (3.215815 --> 3.210307) Saving model as model_scratch_val...
Epoch: 206 Training Loss: 3.209473 Validation Loss: 3.192506
Epoch: 206 Training Accuracy: 0.237760 Validation Accuracy: 0.225150
Epoch: 207 Training Loss: 3.179350 Validation Loss: 3.229929
Epoch: 207 Training Accuracy: 0.235365 Validation Accuracy: 0.226347
Validation loss decreased (3.210307 --> 3.197178) Saving model as model_scratch_val...
Epoch: 208 Training Loss: 3.163344 Validation Loss: 3.209142
Epoch: 208 Training Accuracy: 0.242551 Validation Accuracy: 0.214371
Validation loss decreased (3.197178 --> 3.191626) Saving model as model_scratch_val...
Epoch: 209 Training Loss: 3.175554 Validation Loss: 3.183289
Epoch: 209 Training Accuracy: 0.241204 Validation Accuracy: 0.237126
Accuracy has increased (0.229940 --> 0.237126) Saving model as model_scratch_acc...
Epoch: 210 Training Loss: 3.184031 Validation Loss: 3.160475
Epoch: 210 Training Accuracy: 0.231322 Validation Accuracy: 0.221557
Epoch: 211 Training Loss: 3.178214 Validation Loss: 3.117187
Epoch: 211 Training Accuracy: 0.235963 Validation Accuracy: 0.234731
Validation loss decreased (3.191626 --> 3.162499) Saving model as model_scratch_val...
Epoch: 212 Training Loss: 3.167099 Validation Loss: 3.153887
Epoch: 212 Training Accuracy: 0.243899 Validation Accuracy: 0.229940
Epoch: 213 Training Loss: 3.165128 Validation Loss: 3.150539
Epoch: 213 Training Accuracy: 0.241204 Validation Accuracy: 0.227545
Validation loss decreased (3.162499 --> 3.154212) Saving model as model_scratch_val...
Epoch: 214 Training Loss: 3.176518 Validation Loss: 3.218671
Epoch: 214 Training Accuracy: 0.237610 Validation Accuracy: 0.227545
Epoch: 215 Training Loss: 3.166126 Validation Loss: 3.131978
Epoch: 215 Training Accuracy: 0.236712 Validation Accuracy: 0.234731
Epoch: 216 Training Loss: 3.164419 Validation Loss: 3.120225
Epoch: 216 Training Accuracy: 0.239257 Validation Accuracy: 0.227545
Epoch: 217 Training Loss: 3.159211 Validation Loss: 3.144697
Epoch: 217 Training Accuracy: 0.246594 Validation Accuracy: 0.231138
Validation loss decreased (3.154212 --> 3.144161) Saving model as model_scratch_val...
Epoch: 218 Training Loss: 3.156017 Validation Loss: 3.134391
Epoch: 218 Training Accuracy: 0.246444 Validation Accuracy: 0.229940
Epoch: 219 Training Loss: 3.144932 Validation Loss: 3.127830
Epoch: 219 Training Accuracy: 0.244647 Validation Accuracy: 0.235928
Epoch: 220 Training Loss: 3.142137 Validation Loss: 3.163154
Epoch: 220 Training Accuracy: 0.243000 Validation Accuracy: 0.237126
Epoch: 221 Training Loss: 3.136135 Validation Loss: 3.117907
Epoch: 221 Training Accuracy: 0.247792 Validation Accuracy: 0.228743
Validation loss decreased (3.144161 --> 3.138102) Saving model as model_scratch_val...
Epoch: 222 Training Loss: 3.130488 Validation Loss: 3.135773
Epoch: 222 Training Accuracy: 0.252433 Validation Accuracy: 0.229940
Validation loss decreased (3.138102 --> 3.130898) Saving model as model_scratch_val...
Epoch: 223 Training Loss: 3.128628 Validation Loss: 3.115881
Epoch: 223 Training Accuracy: 0.241204 Validation Accuracy: 0.231138
Validation loss decreased (3.130898 --> 3.123908) Saving model as model_scratch_val...
Epoch: 224 Training Loss: 3.124078 Validation Loss: 3.170024
Epoch: 224 Training Accuracy: 0.251235 Validation Accuracy: 0.240719
Accuracy has increased (0.237126 --> 0.240719) Saving model as model_scratch_acc...
Epoch: 225 Training Loss: 3.106574 Validation Loss: 3.129433
Epoch: 225 Training Accuracy: 0.250936 Validation Accuracy: 0.239521
Validation loss decreased (3.123908 --> 3.121264) Saving model as model_scratch_val...
Epoch: 226 Training Loss: 3.110851 Validation Loss: 3.164701
Epoch: 226 Training Accuracy: 0.249439 Validation Accuracy: 0.244311
Accuracy has increased (0.240719 --> 0.244311) Saving model as model_scratch_acc...
Epoch: 227 Training Loss: 3.132787 Validation Loss: 3.082833
Epoch: 227 Training Accuracy: 0.239257 Validation Accuracy: 0.252695
Accuracy has increased (0.244311 --> 0.252695) Saving model as model_scratch_acc...
Validation loss decreased (3.121264 --> 3.109714) Saving model as model_scratch_val...
Epoch: 228 Training Loss: 3.115636 Validation Loss: 3.117486
Epoch: 228 Training Accuracy: 0.247342 Validation Accuracy: 0.241916
Validation loss decreased (3.109714 --> 3.108695) Saving model as model_scratch_val...
Epoch: 229 Training Loss: 3.100663 Validation Loss: 3.065283
Epoch: 229 Training Accuracy: 0.251834 Validation Accuracy: 0.245509
Validation loss decreased (3.108695 --> 3.108042) Saving model as model_scratch_val...
Epoch: 230 Training Loss: 3.081341 Validation Loss: 3.121293
Epoch: 230 Training Accuracy: 0.255128 Validation Accuracy: 0.243114
Validation loss decreased (3.108042 --> 3.103988) Saving model as model_scratch_val...
Epoch: 231 Training Loss: 3.087677 Validation Loss: 3.054191
Epoch: 231 Training Accuracy: 0.256775 Validation Accuracy: 0.245509
Validation loss decreased (3.103988 --> 3.103697) Saving model as model_scratch_val...
Epoch: 232 Training Loss: 3.096594 Validation Loss: 3.091248
Epoch: 232 Training Accuracy: 0.255427 Validation Accuracy: 0.244311
Validation loss decreased (3.103697 --> 3.095810) Saving model as model_scratch_val...
Epoch: 233 Training Loss: 3.084939 Validation Loss: 3.122943
Epoch: 233 Training Accuracy: 0.259320 Validation Accuracy: 0.235928
Validation loss decreased (3.095810 --> 3.093107) Saving model as model_scratch_val...
Epoch: 234 Training Loss: 3.071101 Validation Loss: 3.088638
Epoch: 234 Training Accuracy: 0.268154 Validation Accuracy: 0.243114
Validation loss decreased (3.093107 --> 3.084819) Saving model as model_scratch_val...
Epoch: 235 Training Loss: 3.090981 Validation Loss: 3.101937
Epoch: 235 Training Accuracy: 0.259769 Validation Accuracy: 0.241916
Validation loss decreased (3.084819 --> 3.079987) Saving model as model_scratch_val...
Epoch: 236 Training Loss: 3.056785 Validation Loss: 3.066923
Epoch: 236 Training Accuracy: 0.257973 Validation Accuracy: 0.235928
Validation loss decreased (3.079987 --> 3.074361) Saving model as model_scratch_val...
Epoch: 237 Training Loss: 3.088672 Validation Loss: 3.036625
Epoch: 237 Training Accuracy: 0.262764 Validation Accuracy: 0.259880
Accuracy has increased (0.252695 --> 0.259880) Saving model as model_scratch_acc...
Epoch: 238 Training Loss: 3.047379 Validation Loss: 3.082693
Epoch: 238 Training Accuracy: 0.266956 Validation Accuracy: 0.255090
Validation loss decreased (3.074361 --> 3.060797) Saving model as model_scratch_val...
Epoch: 239 Training Loss: 3.053712 Validation Loss: 3.044106
Epoch: 239 Training Accuracy: 0.257673 Validation Accuracy: 0.247904
Epoch: 240 Training Loss: 3.045533 Validation Loss: 3.044294
Epoch: 240 Training Accuracy: 0.259171 Validation Accuracy: 0.264671
Accuracy has increased (0.259880 --> 0.264671) Saving model as model_scratch_acc...
Validation loss decreased (3.060797 --> 3.045634) Saving model as model_scratch_val...
Epoch: 241 Training Loss: 3.039705 Validation Loss: 3.112259
Epoch: 241 Training Accuracy: 0.268154 Validation Accuracy: 0.243114
Epoch: 242 Training Loss: 3.053699 Validation Loss: 3.025354
Epoch: 242 Training Accuracy: 0.264261 Validation Accuracy: 0.245509
Epoch: 243 Training Loss: 3.046883 Validation Loss: 3.046190
Epoch: 243 Training Accuracy: 0.257524 Validation Accuracy: 0.251497
Epoch: 244 Training Loss: 3.049020 Validation Loss: 3.060204
Epoch: 244 Training Accuracy: 0.260668 Validation Accuracy: 0.253892
Epoch: 245 Training Loss: 3.052105 Validation Loss: 3.027340
Epoch: 245 Training Accuracy: 0.258272 Validation Accuracy: 0.255090
Validation loss decreased (3.045634 --> 3.020443) Saving model as model_scratch_val...
Epoch: 246 Training Loss: 3.034375 Validation Loss: 3.007017
Epoch: 246 Training Accuracy: 0.264710 Validation Accuracy: 0.252695
Epoch: 247 Training Loss: 3.022372 Validation Loss: 3.020182
Epoch: 247 Training Accuracy: 0.270549 Validation Accuracy: 0.253892
Epoch: 248 Training Loss: 3.020650 Validation Loss: 3.004272
Epoch: 248 Training Accuracy: 0.269801 Validation Accuracy: 0.252695
Epoch: 249 Training Loss: 3.015954 Validation Loss: 3.038483
Epoch: 249 Training Accuracy: 0.269501 Validation Accuracy: 0.251497
Epoch: 250 Training Loss: 3.037405 Validation Loss: 3.033498
Epoch: 250 Training Accuracy: 0.263662 Validation Accuracy: 0.252695
Epoch: 251 Training Loss: 3.018105 Validation Loss: 3.023722
Epoch: 251 Training Accuracy: 0.271148 Validation Accuracy: 0.262275
Epoch: 252 Training Loss: 2.997476 Validation Loss: 2.989423
Epoch: 252 Training Accuracy: 0.272795 Validation Accuracy: 0.261078
Validation loss decreased (3.020443 --> 3.009181) Saving model as model_scratch_val...
Epoch: 253 Training Loss: 3.011675 Validation Loss: 2.989468
Epoch: 253 Training Accuracy: 0.266657 Validation Accuracy: 0.249102
Epoch: 254 Training Loss: 2.990109 Validation Loss: 2.995637
Epoch: 254 Training Accuracy: 0.275790 Validation Accuracy: 0.255090
Epoch: 255 Training Loss: 3.000075 Validation Loss: 3.015027
Epoch: 255 Training Accuracy: 0.270849 Validation Accuracy: 0.262275
Epoch: 256 Training Loss: 3.001055 Validation Loss: 3.037740
Epoch: 256 Training Accuracy: 0.274891 Validation Accuracy: 0.250299
Epoch: 257 Training Loss: 2.995479 Validation Loss: 2.974404
Epoch: 257 Training Accuracy: 0.275191 Validation Accuracy: 0.263473
Epoch: 258 Training Loss: 3.010075 Validation Loss: 2.966813
Epoch: 258 Training Accuracy: 0.266657 Validation Accuracy: 0.247904
Epoch: 259 Training Loss: 2.971134 Validation Loss: 2.957285
Epoch: 259 Training Accuracy: 0.275341 Validation Accuracy: 0.264671
Validation loss decreased (3.009181 --> 2.996492) Saving model as model_scratch_val...
Epoch: 260 Training Loss: 2.990199 Validation Loss: 3.008039
Epoch: 260 Training Accuracy: 0.265309 Validation Accuracy: 0.265868
Accuracy has increased (0.264671 --> 0.265868) Saving model as model_scratch_acc...
Epoch: 261 Training Loss: 2.975455 Validation Loss: 2.932802
Epoch: 261 Training Accuracy: 0.270400 Validation Accuracy: 0.263473
Validation loss decreased (2.996492 --> 2.987573) Saving model as model_scratch_val...
Epoch: 262 Training Loss: 2.961632 Validation Loss: 2.984690
Epoch: 262 Training Accuracy: 0.277437 Validation Accuracy: 0.267066
Accuracy has increased (0.265868 --> 0.267066) Saving model as model_scratch_acc...
Epoch: 263 Training Loss: 2.970838 Validation Loss: 2.983090
Epoch: 263 Training Accuracy: 0.274293 Validation Accuracy: 0.262275
Validation loss decreased (2.987573 --> 2.975762) Saving model as model_scratch_val...
Epoch: 264 Training Loss: 2.961947 Validation Loss: 2.995247
Epoch: 264 Training Accuracy: 0.273394 Validation Accuracy: 0.261078
Epoch: 265 Training Loss: 2.962421 Validation Loss: 2.943192
Epoch: 265 Training Accuracy: 0.273993 Validation Accuracy: 0.268263
Accuracy has increased (0.267066 --> 0.268263) Saving model as model_scratch_acc...
Epoch: 266 Training Loss: 2.955692 Validation Loss: 2.983354
Epoch: 266 Training Accuracy: 0.283276 Validation Accuracy: 0.256287
Validation loss decreased (2.975762 --> 2.971960) Saving model as model_scratch_val...
Epoch: 267 Training Loss: 2.961222 Validation Loss: 2.941499
Epoch: 267 Training Accuracy: 0.277736 Validation Accuracy: 0.274251
Accuracy has increased (0.268263 --> 0.274251) Saving model as model_scratch_acc...
Epoch: 268 Training Loss: 2.943230 Validation Loss: 3.016429
Epoch: 268 Training Accuracy: 0.280731 Validation Accuracy: 0.256287
Epoch: 269 Training Loss: 2.966477 Validation Loss: 3.001827
Epoch: 269 Training Accuracy: 0.280431 Validation Accuracy: 0.268263
Epoch: 270 Training Loss: 2.947989 Validation Loss: 2.944552
Epoch: 270 Training Accuracy: 0.275490 Validation Accuracy: 0.273054
Validation loss decreased (2.971960 --> 2.967297) Saving model as model_scratch_val...
Epoch: 271 Training Loss: 2.941188 Validation Loss: 2.969306
Epoch: 271 Training Accuracy: 0.282677 Validation Accuracy: 0.262275
Epoch: 272 Training Loss: 2.948247 Validation Loss: 2.917534
Epoch: 272 Training Accuracy: 0.280731 Validation Accuracy: 0.263473
Validation loss decreased (2.967297 --> 2.954233) Saving model as model_scratch_val...
Epoch: 273 Training Loss: 2.933625 Validation Loss: 2.938378
Epoch: 273 Training Accuracy: 0.282677 Validation Accuracy: 0.275449
Accuracy has increased (0.274251 --> 0.275449) Saving model as model_scratch_acc...
Validation loss decreased (2.954233 --> 2.949634) Saving model as model_scratch_val...
Epoch: 274 Training Loss: 2.937973 Validation Loss: 2.960940
Epoch: 274 Training Accuracy: 0.283725 Validation Accuracy: 0.271856
Epoch: 275 Training Loss: 2.931550 Validation Loss: 2.950349
Epoch: 275 Training Accuracy: 0.284923 Validation Accuracy: 0.281437
Accuracy has increased (0.275449 --> 0.281437) Saving model as model_scratch_acc...
Epoch: 276 Training Loss: 2.923791 Validation Loss: 3.044055
Epoch: 276 Training Accuracy: 0.283725 Validation Accuracy: 0.265868
Epoch: 277 Training Loss: 2.913700 Validation Loss: 2.938901
Epoch: 277 Training Accuracy: 0.280731 Validation Accuracy: 0.282635
Accuracy has increased (0.281437 --> 0.282635) Saving model as model_scratch_acc...
Validation loss decreased (2.949634 --> 2.942584) Saving model as model_scratch_val...
Epoch: 278 Training Loss: 2.925501 Validation Loss: 2.898729
Epoch: 278 Training Accuracy: 0.279533 Validation Accuracy: 0.276647
Validation loss decreased (2.942584 --> 2.939137) Saving model as model_scratch_val...
Epoch: 279 Training Loss: 2.893425 Validation Loss: 2.954740
Epoch: 279 Training Accuracy: 0.293457 Validation Accuracy: 0.282635
Validation loss decreased (2.939137 --> 2.931625) Saving model as model_scratch_val...
Epoch: 280 Training Loss: 2.916728 Validation Loss: 2.930152
Epoch: 280 Training Accuracy: 0.285522 Validation Accuracy: 0.282635
Epoch: 281 Training Loss: 2.900486 Validation Loss: 2.896766
Epoch: 281 Training Accuracy: 0.281779 Validation Accuracy: 0.273054
Epoch: 282 Training Loss: 2.909466 Validation Loss: 2.901819
Epoch: 282 Training Accuracy: 0.290163 Validation Accuracy: 0.273054
Validation loss decreased (2.931625 --> 2.925920) Saving model as model_scratch_val...
Epoch: 283 Training Loss: 2.879655 Validation Loss: 2.953244
Epoch: 283 Training Accuracy: 0.287318 Validation Accuracy: 0.275449
Epoch: 284 Training Loss: 2.910829 Validation Loss: 2.893557
Epoch: 284 Training Accuracy: 0.286121 Validation Accuracy: 0.292216
Accuracy has increased (0.282635 --> 0.292216) Saving model as model_scratch_acc...
Validation loss decreased (2.925920 --> 2.923718) Saving model as model_scratch_val...
Epoch: 285 Training Loss: 2.898269 Validation Loss: 2.907529
Epoch: 285 Training Accuracy: 0.295553 Validation Accuracy: 0.277844
Epoch: 286 Training Loss: 2.892020 Validation Loss: 2.905437
Epoch: 286 Training Accuracy: 0.297350 Validation Accuracy: 0.276647
Validation loss decreased (2.923718 --> 2.921438) Saving model as model_scratch_val...
Epoch: 287 Training Loss: 2.888301 Validation Loss: 2.949574
Epoch: 287 Training Accuracy: 0.286869 Validation Accuracy: 0.267066
Epoch: 288 Training Loss: 2.871979 Validation Loss: 2.915198
Epoch: 288 Training Accuracy: 0.301093 Validation Accuracy: 0.276647
Validation loss decreased (2.921438 --> 2.909799) Saving model as model_scratch_val...
Epoch: 289 Training Loss: 2.872260 Validation Loss: 2.917272
Epoch: 289 Training Accuracy: 0.290313 Validation Accuracy: 0.270659
Validation loss decreased (2.909799 --> 2.904593) Saving model as model_scratch_val...
Epoch: 290 Training Loss: 2.884797 Validation Loss: 2.890345
Epoch: 290 Training Accuracy: 0.289564 Validation Accuracy: 0.286228
Validation loss decreased (2.904593 --> 2.898360) Saving model as model_scratch_val...
Epoch: 291 Training Loss: 2.865433 Validation Loss: 2.882917
Epoch: 291 Training Accuracy: 0.298697 Validation Accuracy: 0.273054
Epoch: 292 Training Loss: 2.862329 Validation Loss: 2.926948
Epoch: 292 Training Accuracy: 0.296152 Validation Accuracy: 0.283832
Epoch: 293 Training Loss: 2.888048 Validation Loss: 2.889356
Epoch: 293 Training Accuracy: 0.291660 Validation Accuracy: 0.286228
Validation loss decreased (2.898360 --> 2.893616) Saving model as model_scratch_val...
Epoch: 294 Training Loss: 2.849733 Validation Loss: 2.882064
Epoch: 294 Training Accuracy: 0.298099 Validation Accuracy: 0.281437
Epoch: 295 Training Loss: 2.876844 Validation Loss: 2.900681
Epoch: 295 Training Accuracy: 0.289864 Validation Accuracy: 0.271856
Epoch: 296 Training Loss: 2.860848 Validation Loss: 2.897152
Epoch: 296 Training Accuracy: 0.297200 Validation Accuracy: 0.279042
Epoch: 297 Training Loss: 2.860211 Validation Loss: 2.868611
Epoch: 297 Training Accuracy: 0.293158 Validation Accuracy: 0.282635
Validation loss decreased (2.893616 --> 2.883212) Saving model as model_scratch_val...
Epoch: 298 Training Loss: 2.856004 Validation Loss: 2.838748
Epoch: 298 Training Accuracy: 0.294206 Validation Accuracy: 0.285030
Epoch: 299 Training Loss: 2.843515 Validation Loss: 2.868899
Epoch: 299 Training Accuracy: 0.294805 Validation Accuracy: 0.279042
Epoch: 300 Training Loss: 2.841376 Validation Loss: 2.799439
Epoch: 300 Training Accuracy: 0.300344 Validation Accuracy: 0.274251
Validation loss decreased (2.883212 --> 2.878668) Saving model as model_scratch_val...
Training complete in 451m 40s
Best validation accuracy: 0.292216
Best accuracy epoch : 284
Best validation loss : 2.878668
Best validation epoch : 300
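The log above reflects a dual-checkpoint strategy: one model file is saved whenever the validation loss reaches a new minimum, and a separate file is saved whenever the validation accuracy reaches a new maximum, so the best model by either criterion survives training. Below is a minimal sketch of that bookkeeping, assuming a helper class named `CheckpointTracker`; the class, the `update` method, and the exact message wording are illustrative assumptions, not the notebook's actual implementation (the printed loss values in the log also suggest some averaging that this sketch does not reproduce).

```python
class CheckpointTracker:
    """Track the best validation loss and accuracy seen so far.

    Sketch of the logic implied by the training log: a checkpoint named
    model_scratch_val is written on a new best validation loss, and one
    named model_scratch_acc on a new best validation accuracy. Names and
    message format are assumptions for illustration.
    """

    def __init__(self):
        self.best_loss = float("inf")
        self.best_acc = 0.0

    def update(self, valid_loss, valid_acc):
        """Return the checkpoint names that should be written this epoch."""
        to_save = []
        if valid_acc > self.best_acc:
            print(f"Accuracy has increased ({self.best_acc:.6f} --> "
                  f"{valid_acc:.6f}) Saving model as model_scratch_acc...")
            self.best_acc = valid_acc
            to_save.append("model_scratch_acc")
        if valid_loss < self.best_loss:
            print(f"Validation loss decreased ({self.best_loss:.6f} --> "
                  f"{valid_loss:.6f}) Saving model as model_scratch_val...")
            self.best_loss = valid_loss
            to_save.append("model_scratch_val")
        return to_save
```

In a training loop, `update` would be called once per epoch after computing validation metrics, and each returned name would be passed to `torch.save(model.state_dict(), name + ".pt")`. Keeping the two criteria independent is deliberate: as the log shows, the lowest-loss epoch (300) and the highest-accuracy epoch (284) need not coincide.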